Sequence Embedding

Sequence Embedding turns DNA, RNA, or protein sequences into numeric vectors.

Use it for

  • comparing sequences by model representation;
  • creating embedding tables for later analysis;
  • testing local sequence models before building a larger pipeline;
  • feeding embeddings into downstream custom tools.

Inputs

  • FASTA/FA/FNA/FAA/TXT file, or an inline sequence.
  • Molecule type: DNA, RNA, or protein.
  • Max tokens: the token limit per sequence; longer sequences are truncated to this length.
  • Compatible installed AI Model.

If both a file and inline sequence are provided, Liatir uses the file.

Compatible models

Outputs

  • Embeddings CSV.
  • JSON summary.
  • Sequence count.
  • Embedding dimension.
  • Mean sequence length.
  • Provenance.

How to read the result

Do not try to interpret every embedding value manually. An embedding is a vector that becomes useful when compared, clustered, visualized, or passed to another tool.
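As an illustration of comparing embeddings, here is a minimal sketch that computes pairwise cosine similarity between a few vectors. It assumes the embeddings are already loaded as rows of a NumPy array; the function name and the toy vectors are hypothetical and not part of Liatir.

```python
import numpy as np

def cosine_similarity_matrix(embeddings):
    """Return the pairwise cosine similarity between the rows of `embeddings`."""
    x = np.asarray(embeddings, dtype=np.float64)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    # Guard against all-zero rows so the division never hits zero.
    unit = x / np.clip(norms, 1e-12, None)
    return unit @ unit.T

# Toy vectors standing in for three sequence embeddings (made-up values).
vectors = np.array([
    [1.0, 0.0],
    [2.0, 0.0],   # same direction as the first row
    [0.0, 3.0],   # orthogonal to the first row
])
sim = cosine_similarity_matrix(vectors)
print(np.round(sim, 3))
```

Values near 1 mean two sequences point in a similar direction in the model's representation space. A high similarity is a property of the representation, not proof of shared biological function.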

Check:

  • sequence count matches what you expected;
  • embedding dimension is the same for every sequence embedded with a given model;
  • max tokens did not truncate important sequence context;
  • molecule type matches the model.
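The first two checks can be scripted. The sketch below assumes a CSV layout with a header row, a sequence identifier in the first column, and one embedding value per remaining column. That layout, the column names, and the function name are assumptions for illustration; adjust them to match the file Liatir actually writes.

```python
import csv
import io

def summarize_embedding_rows(csv_text):
    """Count data rows and collect the distinct embedding widths.

    Assumed layout (hypothetical): header row, then one row per sequence
    with the identifier first and the embedding values after it.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    rows = [row for row in reader if row]
    dimensions = sorted({len(row) - 1 for row in rows})
    return {"sequence_count": len(rows), "dimensions": dimensions}

# Made-up sample in the assumed layout.
sample = (
    "sequence_id,dim_0,dim_1,dim_2\n"
    "seq1,0.1,0.2,0.3\n"
    "seq2,0.4,0.5,0.6\n"
)
summary = summarize_embedding_rows(sample)
print(summary)
```

A single entry in `dimensions` means every row has the same width. Compare `sequence_count` with the number of records in your input file.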

Technical details

Tool ID: ai-sequence-embedding

Liatir pools model outputs with attention-mask mean pooling and writes a per-sequence embedding table. The output is a representation, not a final biological label.
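Attention-mask mean pooling averages the per-token vectors of a sequence while ignoring padding positions. The following is a minimal NumPy sketch of that general technique, not Liatir's own code; the function name, array shapes, and toy values are assumptions for illustration.

```python
import numpy as np

def masked_mean_pool(hidden_states, attention_mask):
    """Average token vectors over positions where the mask is 1.

    hidden_states:  shape (batch, tokens, dim), per-token model outputs
    attention_mask: shape (batch, tokens), 1 for real tokens, 0 for padding
    """
    hidden = np.asarray(hidden_states, dtype=np.float64)
    mask = np.asarray(attention_mask, dtype=np.float64)[:, :, None]
    summed = (hidden * mask).sum(axis=1)
    # Clip the token count so a fully padded row does not divide by zero.
    counts = np.clip(mask.sum(axis=1), 1.0, None)
    return summed / counts

# Two toy sequences; the first has one padding position at the end.
hidden = np.array([
    [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],
    [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]],
])
mask = np.array([
    [1, 1, 0],
    [1, 1, 1],
])
pooled = masked_mean_pool(hidden, mask)
print(pooled)
```

The padded position in the first sequence does not affect its pooled vector, so sequences of different lengths yield vectors of the same dimension.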
