Why do we do dimension reduction and then clustering? Why not just cluster on the actual data?

Within the Galaxy framework, we recommend Unipept, which uses UniProt databases and annotations to annotate proteins with EC numbers and with functional groupings such as Gene Ontology (GO) and InterPro terms. Other tools, such as EggNOG Mapper, are also available within the Galaxy platform. Software such as MEGAN5, MetaGOmics, MetaProteomeAnalyzer (MPA), and ProPHAnE also generates functional outputs.