Understanding input error messages

Input problems are very common in any analysis that relies on software tools.

  • Causes:
    • No quality assurance or content/formatting checks were run on the first datasets of an analysis workflow.
    • Incomplete dataset Upload.
    • Incorrect or unassigned datatype or database.
    • Tool-specific formatting requirements for inputs were not met.
    • Parameters set on a tool form are a mismatch for the input data content or format.
    • Inputs were in an error state (red), or were putatively successful (green) but empty.
    • Inputs do not meet the datatype specification.
    • Inputs do not contain the exact content that a tool is expecting or that was entered on the tool form.
    • Annotation files are a mismatch for the selected or assigned reference genome build.
    • Special case: Some of the data were generated outside of Galaxy, but later a built-in indexed genome build was assigned in Galaxy for use with downstream tools. This scenario can work, but only if those two reference genomes are an exact match.
  • Solutions:
    • Review our Troubleshooting Tips for what and where to check.
    • Review the GTN for related tutorials on tools/analysis plus FAQs.
    • Review Galaxy Help for prior discussion with extended solutions.
    • Review datatype FAQs.
    • Review the tool form.
      • Input selection areas include usage help.
      • The help section at the bottom of a tool form often has examples. Does your own data match the format/content?
      • See the links to publications and related resources.
    • Review the inputs.
      • All inputs must be in a success state (green) and actually contain content.
      • Did you directly assign the datatype or convert the datatype? What results when the datatype is detected by Galaxy? If these differ, there is likely a content problem.
      • For most analyses, allowing Galaxy to detect the datatype during Upload is best, and adjusting a datatype later should rarely be needed. If a datatype is modified, the change should have a specific purpose.
      • Does your data have headers? Is that in specification for the datatype? Does the tool form have an option to specify if the input has headers or not? Do you need to remove headers first for the correct datatype to be detected? Example: GTF.
      • Large inputs? Consider modifying your inputs to be smaller. Examples: FASTQ and FASTA.
    • Run quality checks on your data.
      • Search GTN tutorials with the keyword “qa-qc” for examples.
      • Search Galaxy Help with the keywords “qa-qc” and your datatype(s) for more help.
    • Reference annotation tips.
    • Input mismatch tips.
      • Do the chromosome/sequence identifiers exactly match between all inputs? Search Galaxy Help for more help about how to correct build/version identifier mismatches between inputs.
      • “Chr1”, “chr1”, and “1” do not mean the same thing to a tool.
    • Custom genome, transcriptome, and exome tips. See FASTA.
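The identifier check described under the input mismatch tips can be sketched outside of Galaxy with a few lines of Python. This is only an illustrative sketch, not a Galaxy tool or API: the function names (fasta_ids, tabular_ids, compare_ids) and the tiny in-memory example data are hypothetical, and it assumes the annotation is tab-delimited with the sequence identifier in the first column (as in GTF or BED) and that comment lines start with "#".

```python
def fasta_ids(lines):
    """Collect sequence identifiers from FASTA header lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            # The identifier is the first whitespace-delimited token after ">".
            parts = line[1:].split()
            if parts:
                ids.add(parts[0])
    return ids


def tabular_ids(lines, column=0):
    """Collect identifiers from one column of a tab-delimited file (e.g. GTF, BED)."""
    ids = set()
    for line in lines:
        line = line.rstrip("\n")
        # Assumption: blank lines and "#" comment/header lines carry no identifiers.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) > column:
            ids.add(fields[column])
    return ids


def compare_ids(reference_ids, annotation_ids):
    """Report which identifiers are shared and which appear in only one input."""
    return {
        "shared": sorted(reference_ids & annotation_ids),
        "only_in_reference": sorted(reference_ids - annotation_ids),
        "only_in_annotation": sorted(annotation_ids - reference_ids),
    }


# Hypothetical example data: the FASTA uses "chr1"/"chr2", the GTF uses "1"/"2".
fasta = [">chr1 example sequence\n", "ACGT\n", ">chr2\n", "GGCC\n"]
gtf = [
    "#!genome-build example\n",
    '1\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "g1";\n',
    '2\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "g2";\n',
]

report = compare_ids(fasta_ids(fasta), tabular_ids(gtf))
print(report)
```

With real data, the same functions could be given open file handles instead of lists. The comparison is an exact, case-sensitive string match, so an empty "shared" list signals the kind of build/version identifier mismatch that a tool would also fail on.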