Introduction

Fly larvae were collected from a forensic death investigation scene and submitted for molecular identification. Due to the sensitive nature of the case, specific details regarding the circumstances and location cannot be disclosed. We PCR-amplified the COI (cytochrome c oxidase subunit I) gene and sequenced the amplicons with Oxford Nanopore technology for molecular species identification. This mitochondrial barcoding region is widely used for insect identification and can provide rapid species-level resolution for forensic entomology applications.

Table of Contents

  • Methods
  • Results
  • Summary Statistics
  • Sequencing Performance
  • Quality Metrics
  • Species Identification
  • BOLD Identifications
  • BLAST Identifications
  • Reliability & Geography
  • COI Limitations
  • Consensus IDs
  • Discussion
  • References

Methods

Sample Collection: Fly larvae were collected from a human decomposition case and preserved for molecular analysis. Specific case details are confidential and cannot be disclosed.

DNA Extraction & Amplification: The COI gene region was PCR-amplified using universal insect primers (Wells & Sperling, 2001):

  • C1-J-1751-F (forward): 5'-ACA CTG ACG ACA TGG TTC TAC AGG ATC ACC TGA TAT AGC ATT CCC-3'
  • TL2-N-3014-R (reverse): 5'-TAC GGT AGC AGA GAC TTG GTC TCG AGG TAT TCC AGC AAG TCC-3'

COI length: 1,539 bp. Amplicon length: 1,264 bp. COI coverage: 82%.

Primer coordinates based on Drosophila yakuba mtDNA reference numbering (Simon et al. 1994). C1-J-1751 binds 212 bp from the start of COI; TL2-N-3014 binds 63 bp from the 3′ end. Amplicon spans the full diagnostic barcode region used for Diptera species identification.
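
The amplicon length and coverage figures follow from the primer offsets quoted above. A minimal arithmetic check, assuming the two offsets (212 bp and 63 bp) mark the portions of COI that fall outside the amplicon; the variable names are illustrative:

```python
# Values taken from the text above; names are illustrative only.
COI_LENGTH_BP = 1539   # full COI gene length
FWD_OFFSET_BP = 212    # C1-J-1751 binds this far from the COI start
REV_OFFSET_BP = 63     # TL2-N-3014 binds this far from the COI 3' end

# Assumption: both offsets lie outside the amplified region.
amplicon_bp = COI_LENGTH_BP - FWD_OFFSET_BP - REV_OFFSET_BP
coverage_pct = 100 * amplicon_bp / COI_LENGTH_BP

print(f"Amplicon: {amplicon_bp} bp, COI coverage: {coverage_pct:.0f}%")
```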

Nanopore Sequencing: Samples were prepared using the Rapid Barcoding Kit 96 V14 (Oxford Nanopore Technologies) and sequenced on an Oxford Nanopore MinION using a Flongle flow cell.

Bioinformatics Pipeline:

  1. Basecalling and demultiplexing: Raw fast5 files were basecalled and demultiplexed using the Oxford Nanopore Guppy basecaller
  2. Quality filtering: Top 20 reads (by quality score) per barcode were selected for consensus generation
  3. Consensus sequence generation: Reads aligned using MAFFT with majority-rule consensus calling
  4. BOLD identification: Consensus sequences queried against BOLD Systems (BOLD Identification Engine v4) using BOLDigger3 (public database, comprehensive mode)
  5. BLAST search: Consensus sequences queried against NCBI nucleotide (nt) database using blastn
  6. Dual-database validation: BOLD and BLAST results cross-validated to detect conflicts and assess confidence

Consensus Sequence Generation:

After basecalling and demultiplexing, the top 20 reads (by quality score) from each barcode were used to build consensus sequences. This approach balances sequencing depth with computational efficiency while minimizing the impact of sequencing errors.
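
The read-selection step can be sketched as follows. The report does not state how the per-read quality score was computed, so this sketch assumes a simple arithmetic mean of Phred scores from a four-line-per-record FASTQ file (some tools average error probabilities instead). All function names are hypothetical.

```python
def mean_phred(quality_line):
    """Arithmetic mean of Phred scores from a FASTQ quality string (Phred+33)."""
    return sum(ord(ch) - 33 for ch in quality_line) / len(quality_line)

def read_fastq(path):
    """Yield (header, sequence, quality) tuples from a four-line-per-record FASTQ."""
    with open(path) as handle:
        while True:
            header = handle.readline().strip()
            if not header:
                break
            seq = handle.readline().strip()
            handle.readline()              # '+' separator line
            qual = handle.readline().strip()
            yield header[1:], seq, qual

def top_reads(records, n=20):
    """Return the n records with the highest mean quality, best first."""
    scored = [r for r in records if r[2]]  # skip records with an empty quality string
    return sorted(scored, key=lambda r: mean_phred(r[2]), reverse=True)[:n]

def write_fasta(records, path):
    """Write the selected reads as FASTA, one sequence per line."""
    with open(path, "w") as out:
        for header, seq, _ in records:
            out.write(f">{header}\n{seq}\n")
```

For example, `write_fasta(top_reads(read_fastq("barcode01.fastq")), "filtered_reads/barcode01_top20.fasta")` would produce an input file in the naming scheme the consensus script expects; the FASTQ filename here is a placeholder.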

The consensus building pipeline performs the following steps:

  1. Multiple sequence alignment using MAFFT with automatic algorithm selection and directional correction
  2. Majority-rule consensus calling where each position is assigned the most common base across aligned reads
  3. Gap removal to produce the final consensus sequence
  4. Single-read handling for barcodes with insufficient read depth

consensus_builder.py:

import os
import subprocess
from collections import Counter

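def mafft_align(fasta_path, out_path):
    # Align reads with MAFFT; --adjustdirectionaccurately reverse-complements
    # reads where needed so that all reads share one orientation.
    with open(out_path, "w") as out_handle:
        subprocess.run([
            "mafft", "--auto", "--adjustdirectionaccurately", "--quiet",
            fasta_path
        ], stdout=out_handle, stderr=subprocess.DEVNULL)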

def majority_consensus(aligned_fasta):
    seqs = []
    current = []
    with open(aligned_fasta) as f:
        for line in f:
            line = line.strip()
            if line.startswith(">"):
                if current:
                    seqs.append("".join(current).upper())
                current = []
            else:
                current.append(line)
        if current:
            seqs.append("".join(current).upper())

    if not seqs:
        return None
    if len(seqs) == 1:
        return seqs[0].replace("-", "")

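    # Column-wise majority vote. Gaps are excluded from the vote, so any column
    # where at least one read has a base contributes a base to the consensus;
    # columns that are all gaps are dropped.
    consensus = []
    for i in range(len(seqs[0])):
        col = [s[i] for s in seqs if i < len(s)]
        counts = Counter(b for b in col if b != "-")
        if counts:
            consensus.append(counts.most_common(1)[0][0])
    return "".join(consensus)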

fasta_dir = "filtered_reads"
tmp_dir = "tmp_alignments"
os.makedirs(tmp_dir, exist_ok=True)

output_fasta = "consensus_sequences.fasta"
skipped = 0
written = 0

with open(output_fasta, "w") as out:
    for fname in sorted(os.listdir(fasta_dir)):
        if not fname.endswith(".fasta"):
            continue
        barcode = fname.replace("_top20.fasta", "")
        fpath = os.path.join(fasta_dir, fname)

        if os.path.getsize(fpath) == 0:
            skipped += 1
            continue

        with open(fpath) as f:
            lines = f.readlines()
        n_seqs = sum(1 for l in lines if l.startswith(">"))
        if n_seqs < 2:
            # Only one read: skip alignment and write the read itself on one line
            seq = "".join(l.strip() for l in lines if not l.startswith(">"))
            out.write(f">{barcode}\n{seq}\n")
            written += 1
            print(f"{barcode}: single read, used directly")
            continue

        aligned = os.path.join(tmp_dir, f"{barcode}_aligned.fasta")
        mafft_align(fpath, aligned)
        consensus = majority_consensus(aligned)

        if consensus:
            out.write(f">{barcode}\n{consensus}\n")
            written += 1
            print(f"{barcode}: consensus from {n_seqs} reads ({len(consensus)} bp)")
        else:
            skipped += 1
            print(f"{barcode}: failed")

print(f"\nDone: {written} consensus sequences written, {skipped} skipped")
print(f"Output: {output_fasta}")
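
In the script above, gaps are excluded from each column's vote, so a base present in only one read at an otherwise gapped column is still carried into the consensus. The sketch below shows a variant in which gaps also vote and gap-majority columns are dropped afterward. It is an illustrative alternative, not the method used for the results in this report, and the function name is hypothetical.

```python
from collections import Counter

def majority_consensus_with_gaps(aligned_seqs):
    """Column-wise majority vote in which '-' counts as a vote.

    Columns where the gap wins are dropped, i.e. 'call the most common
    symbol, then remove gaps'.
    """
    if not aligned_seqs:
        return ""
    width = max(len(s) for s in aligned_seqs)
    consensus = []
    for i in range(width):
        # Treat positions past the end of a short row as gaps.
        column = [s[i].upper() if i < len(s) else "-" for s in aligned_seqs]
        winner = Counter(column).most_common(1)[0][0]
        if winner != "-":
            consensus.append(winner)
    return "".join(consensus)

toy_alignment = ["AC-GT", "AC-GT", "ACTGT"]
print(majority_consensus_with_gaps(toy_alignment))
```

For this toy alignment the variant returns ACGT, whereas the gap-excluding vote keeps the single-read insertion and returns ACTGT.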

Results

Summary Statistics

Table 1: Sequencing run metrics for MinION nanopore sequencing using a Flongle flow cell with 96 barcoded samples.

Metric | Value
Total Barcodes | 96
Barcodes with Data | 87 (90.6%)
Sequencing Time | ~24 hours
Total Reads | >100,000

Table 2: Taxonomic identification performance comparing BOLD Systems and NCBI BLAST databases. Agreement between databases indicates reliable identification; conflicts highlight COI limitations and database gaps.

Metric | BOLD | BLAST
Samples Analyzed | 87 | 77 (9 BOLD-only)
High Confidence | ~50 species-level | 23 species agree + 7 ambiguous
Moderate Confidence | ~15 genus-level | 9 genus agree
Low Confidence | ~15 family-level | 11 conflicts + unresolved
Failed IDs | ~7 samples | 12 samples
Agreement | | 32/77 (42%) agree with BOLD
Most Common Species | Lucilia coeruleiviridis | Lucilia coeruleiviridis

Sequencing Performance

Figure 1: Cumulative read count over the ~24 hour sequencing run. The MinION generated >100,000 quality-passed reads across the Flongle flow cell.

Figure 2: Read count distribution across 96 barcodes, ranked by abundance. The red dashed line indicates the minimum threshold (100 reads) for reliable consensus generation.

Sequence Quality Metrics

Consensus sequences averaged ~1,300 bp in length, covering the full COI barcode region (658 bp) plus flanking primer regions. Most samples showed >97% identity to reference sequences.

Figure 3: Consensus sequence length distribution. Most sequences exceeded the 658 bp COI barcode region, capturing flanking primer sequences.

Figure 4: Percent identity to reference sequences from NCBI BLAST searches. The majority of samples exceeded 97% identity, indicating high-quality species-level matches.

Figure 5: Read depth used for consensus sequence generation. Most samples utilized the maximum depth of 20 reads for optimal consensus quality.

Species Identification Overview

Consensus sequences were queried against two complementary reference databases for taxonomic identification: BOLD Systems (Barcode of Life Data System) and NCBI BLAST (Basic Local Alignment Search Tool). Each database has distinct strengths and limitations for insect identification.

BOLD Identifications

BOLD Systems is a specialized database for DNA barcoding with curated COI sequences and taxonomic assignments. The majority of samples (>60%) achieved species-level identification through BOLD, with varying confidence levels based on sequence quality and reference database matches.

Consensus sequences were queried using the BOLD Identification Engine v4 (BOLDigger3) command-line tool:

boldigger3 identify consensus_sequences.fasta --db 1 --mode 3

This command queries the public BOLD database (--db 1) using the comprehensive identification mode (--mode 3), which returns top matches with similarity scores and taxonomic assignments.

Figure 6: BOLD identification success by taxonomic level. Most samples (>60%) achieved species-level identification, while others were resolved to genus or family level.

Figure 7: BOLD confidence levels for taxonomic identifications. High-confidence identifications (>97% sequence identity) were achieved for the majority of samples.

Figure 8: Family-level composition of identified larvae. Calliphoridae (blow flies) dominated the samples, consistent with their role as primary colonizers of carrion.

Figure 9: Species-level identifications among samples that achieved species-level resolution. Lucilia coeruleiviridis was the most abundant species detected.

BLAST Identifications

NCBI BLAST provides access to a broader genomic database including GenBank submissions. While less specialized for barcoding, BLAST can identify sequences missed by BOLD and provides alternative taxonomic perspectives. Consensus sequences were queried against the NCBI nucleotide (nt) database using the command-line blastn tool:

blastn -query consensus_sequences.fasta -db nt -remote \
  -outfmt "6 qseqid sseqid stitle pident length evalue bitscore" \
  -max_target_seqs 5 > blast_results.tsv

This command queries the remote NCBI nt database, returns the top 5 hits per sequence in tabular format, and includes percent identity, alignment length, e-value, and bit score for downstream filtering.
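
For downstream filtering, the tabular output can be reduced to one best hit per consensus sequence. The following is a minimal sketch, assuming the seven columns requested in the command above and choosing the highest bit score as the selection rule (the report does not state which rule was used). The function name and the example rows are made up for illustration.

```python
import csv
import io

# Column order matches the -outfmt string in the blastn command above.
FIELDS = ["qseqid", "sseqid", "stitle", "pident", "length", "evalue", "bitscore"]

def best_hits(tsv_handle):
    """Return {query id: hit dict}, keeping the highest-bitscore hit per query."""
    best = {}
    reader = csv.DictReader(tsv_handle, fieldnames=FIELDS, delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    for row in reader:
        row["pident"] = float(row["pident"])
        row["length"] = int(row["length"])
        row["bitscore"] = float(row["bitscore"])
        current = best.get(row["qseqid"])
        if current is None or row["bitscore"] > current["bitscore"]:
            best[row["qseqid"]] = row
    return best

# Made-up rows, not data from this study.
example = io.StringIO(
    "barcodeX\thit1\tLucilia sp. example record A\t91.5\t1200\t0.0\t1500\n"
    "barcodeX\thit2\tLucilia sp. example record B\t90.0\t1180\t0.0\t1400\n"
)
hits = best_hits(example)
print(hits["barcodeX"]["sseqid"], hits["barcodeX"]["pident"])
```

With the real output, the same function could be called as `best_hits(open("blast_results.tsv"))`.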

Figure 10: BLAST verdict categories for 77 samples with NCBI data. Agreement with BOLD at species or genus level was achieved in 42% of samples, while conflicts and ambiguous cases highlight COI limitations.

Figure 11: Final confidence distribution combining BOLD and BLAST results. High-confidence identifications (35%) required database agreement; low-confidence and failed cases reflect conflicts or poor sequence quality.

Figure 12: Database agreement breakdown showing the relationship between BOLD and BLAST results. Agreement (37%) validates identifications, while conflicts (13%) highlight database gaps and COI limitations.

Figure 13: Final species identifications after integrating BOLD and BLAST results. Lucilia coeruleiviridis dominated, with several ambiguous species pairs and genus-level assignments reflecting COI limitations.

Key Findings:

  • BLAST confirmed the majority of BOLD identifications for Lucilia coeruleiviridis and Phormia regina
  • Several samples showed species-level conflicts, particularly within the Lucilia genus where COI alone cannot reliably distinguish certain species pairs
  • BLAST identified non-Dipteran contaminants in 12 samples (bacteria: Enterococcus, Ignatzschineria, Photobacterium, Pseudomonas), highlighting the importance of dual-database validation

Identification Reliability and Geographic Considerations

BOLD Database Issues

While the majority of samples achieved high-confidence species-level identifications, several BOLD assignments warrant caution due to their biogeographic incongruence with the sampling location (the northern United States):

Table 3: Biogeographically implausible BOLD identifications. The species identified are restricted to either the Old World (Europe, Australia) or the Neotropics, making their occurrence in the northern United States highly unlikely and suggesting database reference gaps.

Barcode | Species Identified | % Identity | Geographic Range | Issue
barcode74 | Lucilia pulverulenta | 98.19% | Europe, Australia | No established North American presence
barcode92 | Lucilia mexicana | | Mexico, Central America | Only 1 BOLD record; very unlikely in the northern United States
barcode30 | Lucilia eximia | | Neotropical | Likely misidentified L. coeruleiviridis
barcode34 | Lucilia eximia | | Neotropical | Likely misidentified L. coeruleiviridis
barcode64 | Lucilia eximia | | Neotropical | Likely misidentified L. coeruleiviridis
barcode69 | Lucilia eximia | | Neotropical | Likely misidentified L. coeruleiviridis
barcode85 | Lucilia eximia | | Neotropical | Likely misidentified L. coeruleiviridis

Lucilia pulverulenta (barcode74) — This is primarily an Old World species found in Europe and Australia with essentially no established presence in North America. A 98.19% identity match in the northern United States is almost certainly a misidentification, likely representing a Lucilia species whose COI sequence isn’t well-represented in BOLD.

Lucilia mexicana (barcode92) — Primarily a Mexican/Central American species. While possible, this species is very unlikely in the northern United States. The identification was supported by only 1 record in BOLD, which is a red flag for a spurious hit indicating insufficient reference data.

Lucilia eximia (barcodes 30, 34, 64, 69, 85) — Primarily Neotropical (South/Central America). Finding 5 individuals of this species in the northern United States would be remarkable and almost certainly reflects a reference database gap rather than genuine identifications. L. eximia and L. coeruleiviridis have notoriously similar COI sequences, and BOLD’s coverage of North American Lucilia is patchy enough that this kind of confusion is common.
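
A biogeographic screen of this kind can be expressed as a simple lookup. The sketch below uses only the three species and ranges summarized in Table 3; the dictionary and function names are hypothetical, and a real screen would need a curated range resource.

```python
# Ranges as summarized in Table 3; illustrative lookup, not a curated resource.
OUT_OF_RANGE_SPECIES = {
    "Lucilia pulverulenta": "Europe, Australia",
    "Lucilia mexicana": "Mexico, Central America",
    "Lucilia eximia": "Neotropical",
}

def biogeography_flag(species):
    """Return a caution note if the species is listed as out of range, else None."""
    reported_range = OUT_OF_RANGE_SPECIES.get(species)
    if reported_range is None:
        return None
    return f"{species}: reported range is {reported_range}; review before accepting"

for name in ["Lucilia eximia", "Phormia regina"]:
    print(name, "->", biogeography_flag(name))
```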

BLAST Database Issues

While BLAST provided valuable cross-validation of BOLD results, several identification conflicts and database-specific issues emerged:

Table 4: BOLD and BLAST identification conflicts. Discordant species assignments highlight COI limitations for closely related species and database-specific biases in reference coverage.

Barcode | BOLD ID | BLAST ID | BLAST % Identity | Issue
barcode31 | Lucilia coeruleiviridis | Lucilia pulverulenta | 94.68% | Conflict: BLAST suggests Old World species unlikely in the northern United States
barcode35 | Lucilia retroversa | Lucilia coeruleiviridis | 80.21% | Conflict: Low BLAST identity; species determination uncertain
barcode50 | Lucilia retroversa | Lucilia coeruleiviridis | 81.92% | Conflict: Both identifications plausible; COI insufficient for resolution
barcode74 | Lucilia pulverulenta | Lucilia coeruleiviridis | 84.73% | Conflict: BLAST favors L. coeruleiviridis (more likely geographically)
barcode30 | Lucilia eximia | Lucilia mexicana | 82.15% | Conflict: Both Neotropical; poor BLAST identity suggests neither correct
barcode34 | Lucilia eximia | Lucilia mexicana | 87.48% | Conflict: Neotropical species unlikely; low confidence overall
barcode69 | Lucilia eximia | Lucilia coeruleiviridis | 87.86% | Conflict: BLAST supports North American species
barcode85 | Lucilia eximia | Lucilia mexicana | 92.01% | Conflict: Neotropical assignments in the northern United States questionable

Conflict Patterns:

1. Lucilia retroversa vs. L. coeruleiviridis conflicts (barcodes 35, 50): BOLD identified these as L. retroversa, but BLAST matched L. coeruleiviridis with low identity (80-82%). Both species occur in North America, but the low BLAST identities suggest possible database gaps for L. retroversa in GenBank.

2. Lucilia pulverulenta mismatches (barcodes 31, 74): The two databases disagreed in opposite directions. For barcode74, BOLD assigned L. pulverulenta (an Old World species) while BLAST matched L. coeruleiviridis (North American); for barcode31, BOLD assigned L. coeruleiviridis while BLAST matched L. pulverulenta. In both cases L. coeruleiviridis is the more biogeographically plausible assignment, suggesting that L. pulverulenta reference sequences may be contaminating North American identifications.

3. Neotropical Lucilia conflicts (barcodes 30, 34, 69, 85): BOLD identified these samples as L. eximia (Neotropical), while BLAST returned L. mexicana or L. coeruleiviridis. Given the northern United States collection site, the Neotropical assignments are suspect. The conflicts likely reflect incomplete COI reference coverage for Nearctic Lucilia species in both databases.

4. Low BLAST percent identities: Many samples showed 80-90% BLAST identity despite high BOLD matches (>97%). This discrepancy suggests:

  • BOLD’s curated COI barcode database is more complete for North American blow flies
  • NCBI GenBank contains many partial or lower-quality COI sequences
  • Geographic sampling biases in GenBank favor European and Asian specimens

5. Non-Dipteran contaminants: BLAST identified 12 bacterial contaminants (Enterococcus, Ignatzschineria, Photobacterium, Pseudomonas) that BOLD could not classify, demonstrating BLAST’s utility for detecting non-target DNA.
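
The cross-database comparison behind these conflict patterns can be sketched as a small classifier. The category names loosely follow the labels in Table 2; the exact rules used for the report are not given, so the logic below (compare genus first, then species epithet) is an assumption and the function name is hypothetical.

```python
def compare_ids(bold_name, blast_name):
    """Classify a BOLD/BLAST pair as species_agree, genus_agree, conflict, or unresolved."""
    if not bold_name or not blast_name:
        return "unresolved"
    bold_parts = bold_name.split()
    blast_parts = blast_name.split()
    if bold_parts[0] != blast_parts[0]:
        return "conflict"
    # Same genus: compare species epithets when both are present.
    bold_species = bold_parts[1] if len(bold_parts) > 1 else "sp."
    blast_species = blast_parts[1] if len(blast_parts) > 1 else "sp."
    if "sp." in (bold_species, blast_species):
        return "genus_agree"
    return "species_agree" if bold_species == blast_species else "conflict"

# barcode69 from Table 4: BOLD and BLAST disagree at species level.
print(compare_ids("Lucilia eximia", "Lucilia coeruleiviridis"))
```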

The L. coeruleiviridis / mexicana Problem

Seven samples in our dataset were flagged as “ambiguous” due to indistinguishable COI sequences between Lucilia coeruleiviridis and L. mexicana (barcodes 03, 11, 18, 26, 59, 66, 94). This is not a database error but a well-documented biological limitation of COI barcoding for this species pair.

DeBry et al. (2012) Study:

DeBry et al. conducted a comprehensive COI barcoding study of continental U.S. Lucilia species, assembling ~1,100 bp COI sequences from 122 specimens representing 9 of the 10 U.S. species. Their key findings:

  1. Monophyly Test: They defined a species as “DNA-identifiable” if it formed an exclusively monophyletic clade in >95% of bootstrap pseudoreplicates in COI phylogenies.

  2. Seven Species Passed: Most Lucilia species (including L. illustris, L. sericata, L. cuprina) formed well-supported monophyletic groups separable by COI alone.

  3. L. coeruleiviridis and L. mexicana Failed: These two species share COI haplotypes and do not form exclusive, separable clades. As sampled in the continental U.S., they are indistinguishable using mitochondrial COI alone.

Our seven ambiguous identifications are consistent with DeBry et al.’s findings. Where BOLD assigned L. coeruleiviridis and BLAST returned L. mexicana (or vice versa), we marked these as Lucilia coeruleiviridis / mexicana with high confidence for the species pair, but low confidence for distinguishing between them. Given the northern United States collection site, L. coeruleiviridis is more biogeographically likely, but COI data alone cannot definitively rule out L. mexicana.
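
The pair-labeling rule described above can be sketched as follows. Only the species pair discussed in this section is listed, and the names `INDISTINGUISHABLE_PAIRS` and `collapse_pair` are hypothetical.

```python
# Species pairs treated as indistinguishable by COI; only the pair discussed above.
INDISTINGUISHABLE_PAIRS = [
    frozenset({"Lucilia coeruleiviridis", "Lucilia mexicana"}),
]

def collapse_pair(bold_name, blast_name):
    """Return a combined label when the two IDs form a listed pair, else None."""
    if frozenset({bold_name, blast_name}) in INDISTINGUISHABLE_PAIRS:
        first, second = sorted([bold_name, blast_name])
        genus, epithet = first.split()
        return f"{genus} {epithet} / {second.split()[1]}"
    return None

# The label is the same regardless of which database returned which name.
print(collapse_pair("Lucilia mexicana", "Lucilia coeruleiviridis"))
```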

Consensus Identification Table

The following table presents our best interpretation of each sample’s identity after integrating BOLD and BLAST results with biogeographic assessment. Identifications flagged as L. mexicana have been corrected to L. coeruleiviridis / mexicana (acknowledging COI indistinguishability) or reassigned to Lucilia sp. where conflicts render species-level assignment unreliable.

Table 5: Consensus species identifications for all samples after integrating BOLD, BLAST, and biogeographic assessment. Identifications apply corrections for COI limitations (L. coeruleiviridis/mexicana indistinguishability) and biogeographically implausible assignments.

Barcode | Consensus Identification | Confidence | Notes
barcode01 | Lucilia sp. | Moderate |
barcode02 | Lucilia coeruleiviridis | High |
barcode03 | Lucilia coeruleiviridis / mexicana | High | COI cannot distinguish species pair
barcode04 | Lucilia coeruleiviridis | High |
barcode05 | Failed | Failed | Non-Dipteran contaminant or poor quality
barcode06 | Lucilia coeruleiviridis | Low |
barcode07 | Lucilia coeruleiviridis / mexicana | Low | COI indistinguishable; likely L. coeruleiviridis (biogeography)
barcode09 | Lucilia coeruleiviridis / mexicana | Low | COI indistinguishable; likely L. coeruleiviridis (biogeography)
barcode10 | Lucilia coeruleiviridis | High |
barcode11 | Lucilia coeruleiviridis / mexicana | High | COI cannot distinguish species pair
barcode12 | Failed | Failed | Non-Dipteran contaminant or poor quality
barcode13 | Failed | Failed | Non-Dipteran contaminant or poor quality
barcode14 | Failed | Failed | Non-Dipteran contaminant or poor quality
barcode15 | Lucilia coeruleiviridis | High |
barcode16 | Lucilia coeruleiviridis | High |
barcode17 | Phormia regina | High |
barcode18 | Lucilia coeruleiviridis / mexicana | High | COI cannot distinguish species pair
barcode19 | Lucilia coeruleiviridis | High |
barcode20 | Failed | Failed | Non-Dipteran contaminant or poor quality
barcode21 | Lucilia coeruleiviridis | High |
barcode22 | Lucilia coeruleiviridis | Low | BLAST conflict with unidentified specimen; BOLD ID retained
barcode23 | Lucilia coeruleiviridis | High |
barcode25 | Failed | Failed | Non-Dipteran contaminant or poor quality
barcode26 | Lucilia coeruleiviridis / mexicana | High | COI cannot distinguish species pair
barcode27 | Lucilia coeruleiviridis | High |
barcode28 | No match | Moderate | No reliable database hit
barcode29 | Lucilia sp. | Moderate |
barcode30 | Lucilia sp. | Low | Conflict; biogeographically implausible IDs
barcode31 | Lucilia sp. | Low | Database conflict
barcode33 | Lucilia vulgata | Low | European species; ID uncertain
barcode34 | Lucilia sp. | Low | Conflict; biogeographically implausible IDs
barcode35 | Lucilia retroversa / coeruleiviridis | Low | Both Nearctic; COI insufficient for distinction
barcode36 | Lucilia sp. | Low | Neotropical ID implausible; likely misidentified L. coeruleiviridis
barcode37 | Failed | Failed | Non-Dipteran contaminant or poor quality
barcode38 | Lucilia coeruleiviridis | High |
barcode39 | Lucilia coeruleiviridis | High |
barcode41 | Lucilia coeruleiviridis | High |
barcode42 | Lucilia sp. | Moderate |
barcode43 | Lucilia coeruleiviridis | High |
barcode44 | Lucilia sp. | Moderate |
barcode45 | Phormia regina | High |
barcode46 | Lucilia coeruleiviridis | Moderate |
barcode47 | Lucilia coeruleiviridis | Moderate |
barcode48 | Lucilia sp. | Moderate |
barcode49 | Lucilia coeruleiviridis | Low |
barcode50 | Lucilia retroversa / coeruleiviridis | Low | Both Nearctic; COI insufficient for distinction
barcode51 | Failed | Failed | Non-Dipteran contaminant or poor quality
barcode52 | Lucilia sp. | Low | Old World species implausible in the northern United States
barcode53 | Lucilia sp. | Moderate |
barcode54 | No match | Moderate | No reliable database hit
barcode55 | Lucilia illustris | High |
barcode57 | Lucilia coeruleiviridis | High |
barcode58 | Lucilia sp. | Moderate |
barcode59 | Lucilia coeruleiviridis / mexicana | High | COI cannot distinguish species pair
barcode60 | Lucilia coeruleiviridis | High |
barcode61 | Lucilia coeruleiviridis | Moderate |
barcode62 | Lucilia coeruleiviridis | Low |
barcode63 | Lucilia coeruleiviridis | Moderate |
barcode64 | Lucilia sp. | Moderate | Neotropical ID implausible; likely misidentified L. coeruleiviridis
barcode65 | Failed | Failed | Non-Dipteran contaminant or poor quality
barcode66 | Lucilia coeruleiviridis / mexicana | High | COI cannot distinguish species pair
barcode67 | Lucilia sp. | Moderate |
barcode68 | No match | Moderate | No reliable database hit
barcode69 | Lucilia sp. | Low | Conflict; biogeographically implausible IDs
barcode70 | Failed | Failed | Non-Dipteran contaminant or poor quality
barcode72 | Lucilia sp. | Moderate |
barcode73 | Lucilia coeruleiviridis | Low |
barcode74 | Lucilia coeruleiviridis | Low | L. pulverulenta (Old World) implausible; BLAST ID retained
barcode75 | Lucilia coeruleiviridis | High |
barcode76 | Failed | Failed | Non-Dipteran contaminant or poor quality
barcode77 | Lucilia coeruleiviridis | High |
barcode78 | Lucilia coeruleiviridis | High |
barcode79 | Phormia regina | Low |
barcode80 | Lucilia coeruleiviridis | Low | BLAST conflict with unidentified specimen; BOLD ID retained
barcode81 | Lucilia sp. | Low | Insufficient data for species-level ID
barcode82 | Lucilia coeruleiviridis / mexicana | Low | COI indistinguishable; likely L. coeruleiviridis (biogeography)
barcode83 | Lucilia coeruleiviridis | Low |
barcode84 | Lucilia coeruleiviridis | Low | BLAST conflict with unidentified specimen; BOLD ID retained
barcode85 | Lucilia sp. | Low | Conflict; biogeographically implausible IDs
barcode86 | Lucilia coeruleiviridis | High |
barcode89 | Lucilia coeruleiviridis / mexicana | Low | COI indistinguishable; likely L. coeruleiviridis (biogeography)
barcode90 | Lucilia sp. | Low | Neotropical ID implausible; likely misidentified L. coeruleiviridis
barcode91 | Lucilia coeruleiviridis | Moderate |
barcode92 | Lucilia coeruleiviridis / mexicana | High | COI indistinguishable; likely L. coeruleiviridis (biogeography)
barcode93 | Failed | Failed | Non-Dipteran contaminant or poor quality
barcode94 | Lucilia coeruleiviridis / mexicana | High | COI cannot distinguish species pair

Summary:

  • 33 samples identified as Lucilia coeruleiviridis (19 with high confidence)
  • 12 samples flagged as L. coeruleiviridis / mexicana (COI indistinguishable species pair)
  • 3 samples identified as Phormia regina
  • 1 sample identified as Lucilia illustris
  • 1 sample identified as Lucilia vulgata (low confidence)
  • 2 samples as L. retroversa / coeruleiviridis (uncertain)
  • 19 samples assigned to Lucilia sp. (genus-level only)
  • 12 samples failed (non-Dipteran contaminant or poor quality)
  • 3 samples with no reliable match

Discussion

Our nanopore sequencing approach, combined with dual-database validation (BOLD and BLAST), successfully identified fly larvae with varying confidence levels. Of 87 samples, 30 (35%) achieved high-confidence identifications where both databases agreed, while 26 (30%) remained low-confidence due to conflicts or insufficient reference data. This dual-database strategy proved essential for:

  1. Cross-validation: 42% of the 77 samples with BLAST data showed agreement between BOLD and BLAST, providing confidence in species assignments
  2. Conflict detection: 11 samples revealed species-level disagreements, highlighting regions where COI alone is insufficient
  3. Contaminant identification: BLAST detected 12 bacterial contaminants missed by BOLD’s arthropod-focused database
  4. Database bias assessment: Comparison revealed BOLD’s superior coverage for North American blow flies, while BLAST provided broader taxonomic scope

The blow fly family Calliphoridae dominated the samples, with Lucilia coeruleiviridis being the most abundant species (33 samples in Table 5, 38%). This species is a primary colonizer of carrion commonly encountered in forensic investigations.

Several BOLD identifications of Neotropical species (L. eximia, L. mexicana) were contradicted or poorly supported by BLAST, highlighting database gaps for Nearctic Lucilia species. These conflicts likely represent misidentifications due to incomplete reference coverage rather than genuine biogeographic anomalies.

Technical Limitations

COI Gene Limitations: Seven samples showed L. coeruleiviridis / L. mexicana ambiguity, a known COI limitation in which this species pair shares nearly identical barcode sequences. Additional genetic markers (e.g., CAD, ITS2) would be required for definitive separation.

Database Completeness: Low-confidence identifications and conflicts between databases underscore the critical dependence on reference sequence availability. For North American forensic entomology applications, BOLD’s curated COI database outperformed NCBI GenBank, which contains many partial or geographically biased sequences.

Failed Identifications: Twelve samples (14%) could not be identified with either database. Likely causes include:

  • Bacterial DNA contamination from decomposition microbiome
  • Low read counts producing poor-quality consensus sequences
  • Non-target arthropod DNA (e.g., mites, parasitoids)
  • Sequences from species absent from both reference databases

References

Abeynayake, S. W., Fiorito, S., Dinsdale, A., Whattam, M., Crowe, B., Sparks, K., Campbell, P. R., & Gambley, C. (2021). A Rapid and Cost-Effective Identification of Invertebrate Pests at the Borders Using MinION Sequencing of DNA Barcodes. Genes, 12(8), 1138. https://doi.org/10.3390/genes12081138

Boehme, P., Amendt, J., & Zehner, R. (2011). The use of COI barcodes for molecular identification of forensically important fly species in Germany. Parasitology Research, 110(6), 2325–2332. https://doi.org/10.1007/s00436-011-2767-8

DeBry, R. W., Timm, A., Wong, E. S., Stamper, T., Cookman, C., & Dahlem, G. A. (2012). DNA-Based Identification of Forensically Important Lucilia (Diptera: Calliphoridae) in the Continental United States. Journal of Forensic Sciences, 58(1), 73–78. https://doi.org/10.1111/j.1556-4029.2012.02176.x

Sandoval-Arias, S., Morales-Montero, R., Araya-Valverde, E., & Hernández-Carvajal, E. (2020). Identificación molecular mediante código de barras de DNA de moscas Lucilia (Diptera: Calliphoridae) recolectadas en Costa Rica. Revista Tecnología En Marcha, 33(1). https://doi.org/10.18845/tm.v33i1.5025

Srivathsan, A., Baloğlu, B., Wang, W., Tan, W. X., Bertrand, D., Ng, A. H. Q., Boey, E. J. H., Koh, J. J. Y., Nagarajan, N., & Meier, R. (2018). A MinION™-based pipeline for fast and cost-effective DNA barcoding. Molecular Ecology Resources, 18(5), 1035–1049. https://doi.org/10.1111/1755-0998.12890

Wells, J. D., & Sperling, F. A. H. (2001). DNA-based identification of forensically important Chrysomyinae (Diptera: Calliphoridae). Forensic Science International, 120(1-2), 110–115. https://doi.org/10.1016/s0379-0738(01)00414-5

Yusseff-Vanegas, S. Z., & Agnarsson, I. (2017). DNA-barcoding of forensically important blow flies (Diptera: Calliphoridae) in the Caribbean Region. PeerJ, 5, e3516. https://doi.org/10.7717/peerj.3516