AlcoR

AlcoR characterizes sequences quickly by identifying their low-complexity regions, which makes it well suited to scenarios involving new or unknown sequences. Its main advantages are high sensitivity, speed, and the absence of false positives, which make it a good fit for current telomere-to-telomere (T2T) sequencing and assembly methodologies. AlcoR is implemented in C, uses multi-threading to increase computational speed, is flexible enough for multiple applications, and has no external dependencies. The tool accepts any sequence in FASTA format.

The AlcoR tool has one main menu (command: AlcoR) with five sub-menus, one for each feature it provides:

  • info: reports the length and GC percentage of each FASTA read;
  • extract: extracts a sequence from a FASTA file using positional coordinates (independent of the existing headers in the FASTA file);
  • mapper: computes the low-complexity regions of a FASTA read while providing bidirectional complexity profiles and further structural similarity analysis;
  • simulation: simulates FASTA sequences through file extraction, random generation, and sequence modeling. It also allows applying SNP mutations with specific probabilities;
  • visual: computes an SVG file with the respective map containing the low-complexity regions.
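
To make the info and extract features more concrete, here is a minimal C sketch of the underlying computations: the GC percentage of a read and the extraction of a subsequence by positional coordinates. It is not AlcoR's source code. The function names gc_percent and extract_range are hypothetical, and the 1-based inclusive coordinate convention is an assumption made for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of AlcoR): percentage of G/C symbols in the
 * first `len` characters of `seq`, ignoring case. Symbols other than G and C
 * (for example N) count toward the length only. Returns 0.0 for an empty
 * sequence. */
double gc_percent(const char *seq, size_t len)
{
    size_t gc = 0;
    for (size_t i = 0; i < len; ++i) {
        char c = seq[i];
        if (c == 'G' || c == 'C' || c == 'g' || c == 'c')
            ++gc;
    }
    return len == 0 ? 0.0 : 100.0 * (double)gc / (double)len;
}

/* Hypothetical helper (not part of AlcoR): copy positions [init, end] of
 * `seq` into `out` as a NUL-terminated string. Coordinates are ASSUMED to be
 * 1-based and inclusive. Returns the number of symbols copied, or 0 when the
 * range is invalid or `out` (capacity `out_cap`) is too small. */
size_t extract_range(const char *seq, size_t len,
                     size_t init, size_t end,
                     char *out, size_t out_cap)
{
    if (init == 0 || init > end || end > len)
        return 0;
    size_t n = end - init + 1;
    if (n + 1 > out_cap)          /* need room for the terminating NUL */
        return 0;
    memcpy(out, seq + (init - 1), n);
    out[n] = '\0';
    return n;
}
```

Under these assumptions, calling gc_percent on the 8-symbol read ACGTGGCC counts six G/C symbols, and extract_range with init 3 and end 5 copies the three symbols GTG. Both helpers work on the sequence alone, which mirrors the point that extraction is independent of the FASTA headers.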

Cite as

J. M. Silva, W. Qi, A. J. Pinho, D. Pratas, AlcoR: alignment-free simulation, mapping, and visualization of low-complexity regions in biological data, GigaScience, Volume 12, 2023. https://doi.org/10.1093/gigascience/giad101