FairSubset: A tool to choose representative subsets of data for use with replicates or groups of different sample sizes
Keywords: statistics, normalization, automation, microscopy
High-impact journals are promoting transparency of data. Modern scientific methods can be automated and can produce disparate sample sizes. In many cases, it is desirable to retain identical or pre-defined sample sizes between replicates or groups. However, choosing the subset of the originally acquired data that best matches the entire data set without introducing bias is not trivial. Here, we released a free online tool, FairSubset, and its constituent Shiny App R code to subset data in an unbiased fashion. Subsets were set at the same N across samples and retained representative average and standard deviation information. The method can be used for quantitation of entire fields of view or other replicates without biasing the data pool toward large-N samples. We showed examples of the tool’s use with fluorescence data and DNA-damage-related Comet tail quantitation. The FairSubset tool and its method of retaining distribution information at the single-datum level may be considered for standardized use in fair publishing practices.
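The idea of a subset that keeps a sample's average and standard deviation can be illustrated with a minimal sketch. It is written in Python rather than the tool's Shiny/R code, and it does not reproduce the FairSubset implementation: the function names `representative_subset` and `equalize`, the number of random trials, and the scoring rule (sum of absolute differences in mean and standard deviation, scaled by the full-sample standard deviation) are all assumptions made for illustration.

```python
import random
import statistics


def representative_subset(values, n, trials=1000, seed=0):
    """Return the random size-n subset whose mean and SD best match `values`.

    Illustrative only: the scoring rule and trial count are assumptions,
    not the published FairSubset procedure.
    """
    if not 2 <= n <= len(values):
        raise ValueError("n must be between 2 and len(values)")
    rng = random.Random(seed)
    full_mean = statistics.mean(values)
    full_sd = statistics.stdev(values)
    scale = full_sd or 1.0  # avoid dividing by zero for constant data

    best, best_score = None, float("inf")
    for _ in range(trials):
        # Draw n distinct data points without replacement.
        candidate = rng.sample(values, n)
        score = (abs(statistics.mean(candidate) - full_mean)
                 + abs(statistics.stdev(candidate) - full_sd)) / scale
        if score < best_score:
            best, best_score = candidate, score
    return best


def equalize(groups, trials=1000, seed=0):
    """Subset every group to the size of the smallest group."""
    n = min(len(v) for v in groups.values())
    return {name: representative_subset(v, n, trials, seed)
            for name, v in groups.items()}
```

Calling `equalize({"control": control_values, "treated": treated_values})` with two hypothetical lists of measurements would return both groups at the same N, each subset drawn only from that group's original data points. Scoring many random draws, rather than picking points by hand, is one way to avoid choosing a subset by eye.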
Authors who publish with JBM agree to the following terms:
- Authors retain copyright and grant JBM right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).