Text Mining Differences in Chinese Newspaper Articles

statistics-text_mining_chinese/main-workflow

Author(s): Daniela Schneider
Version: 1
Last updated: Feb 28, 2025
License: CC-BY-4.0
Tags: Humanities, comparison, text, diff, published

Features
Tutorial (hands-on): Text-Mining Differences in Chinese Newspaper Articles

Workflow Testing
Tests: ❌
Results: Not yet automated
FAIRness
PURL: https://gxy.io/GTN:
flowchart TD
  0["ℹ️ Input Dataset\nInput censored text"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nInput uncensored text "];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["Preprocessing of Text one"];
  0 -->|output| 2;
  3["Preprocessing of Text two"];
  1 -->|output| 3;
  4["Comparison with diff - computer version"];
  2 -->|outfile| 4;
  3 -->|outfile| 4;
  5["Comparison with diff - user version"];
  2 -->|outfile| 5;
  3 -->|outfile| 5;
  6["Extracting only censored passages"];
  4 -->|diff_file| 6;
  7["Compute"];
  6 -->|out_file1| 7;
  8["Cut"];
  7 -->|out_file1| 8;
  9["Datamash"];
  7 -->|out_file1| 9;
  10["Generate a word cloud"];
  8 -->|out_file1| 10;
  0e468eb9-b68c-4840-b5e7-12cbf29e0c9c["Output\noutput_graphic"];
  10 --> 0e468eb9-b68c-4840-b5e7-12cbf29e0c9c;
  style 0e468eb9-b68c-4840-b5e7-12cbf29e0c9c stroke:#2c3143,stroke-width:4px;
  11["Sort"];
  9 -->|out_file| 11;
  9dfd8862-e7d7-46ae-bbe2-3e333a017764["Output\noutput_csv"];
  11 --> 9dfd8862-e7d7-46ae-bbe2-3e333a017764;
  style 9dfd8862-e7d7-46ae-bbe2-3e333a017764 stroke:#2c3143,stroke-width:4px;
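
The flowchart above can be approximated outside Galaxy as a rough sketch. It rests on assumptions this page does not confirm: that preprocessing puts one character on each line, and that the "censored passages" are the characters present in the uncensored text but missing from the censored one. It swaps the Galaxy diff, Filter, Compute, Cut, Datamash and Sort tools for Python's standard difflib and collections modules, and it leaves out the word cloud step. All function names are hypothetical.

```python
import difflib
from collections import Counter


def preprocess(text: str) -> list[str]:
    """Split a text into single characters, dropping whitespace.

    Assumption: this mirrors a 'one character per line' preprocessing step.
    """
    return [ch for ch in text if not ch.isspace()]


def removed_tokens(uncensored: list[str], censored: list[str]) -> list[str]:
    """Return tokens found in the uncensored sequence but not in the censored one."""
    matcher = difflib.SequenceMatcher(a=uncensored, b=censored, autojunk=False)
    removed = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # 'delete' and 'replace' both mean the span uncensored[i1:i2] was lost.
        if tag in ("delete", "replace"):
            removed.extend(uncensored[i1:i2])
    return removed


def frequency_table(tokens: list[str]) -> list[tuple[str, int]]:
    """Count tokens and sort by descending count, then by token."""
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Toy ASCII input standing in for the two newspaper texts.
    uncensored = preprocess("abcXXdeY")
    censored = preprocess("abcde")
    for token, count in frequency_table(removed_tokens(uncensored, censored)):
        print(f"{token}\t{count}")
```

The sorted table plays the role of the workflow's output_csv; the same token list could be passed to a word cloud library to approximate output_graphic.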

Inputs

Input          Label
Input dataset  Input censored text
Input dataset  Input uncensored text

Outputs

From Output Label
toolshed.g2.bx.psu.edu/repos/bgruening/wordcloud/wordcloud/1.9.4+galaxy0 Generate a word cloud
sort1 Sort

Tools

Tool
Cut1
Filter1
sort1
toolshed.g2.bx.psu.edu/repos/bgruening/diff/diff/3.10+galaxy0
toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_line/9.3+galaxy1
toolshed.g2.bx.psu.edu/repos/bgruening/wordcloud/wordcloud/1.9.4+galaxy0
toolshed.g2.bx.psu.edu/repos/devteam/column_maker/Add_a_column1/2.1
toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.8+galaxy0

To use this workflow in Galaxy, either download the workflow file, or right-click and copy the link to the workflow and paste it into Galaxy's workflow import form.

Importing into Galaxy

The steps below import the workflow directly into the Galaxy server of your choice.
Hands On: Importing a workflow
  • Click on Workflow on the top menu bar of Galaxy. You will see a list of all your workflows.
  • Click on Import at the top-right of the screen
  • Provide your workflow
    • Option 1: Paste the URL of the workflow into the box labelled “Archived Workflow URL”
    • Option 2: Upload the workflow file in the box labelled “Archived Workflow File”
  • Click the Import workflow button

Below is a short video demonstrating how to import a workflow from GitHub using this procedure:

Video: Importing a workflow from URL

Version History

Version  Commit     Time                 Comments
1        581b8fc4f  2025-02-28 10:19:49  add tutorial from @Sch-Da

For Admins

Installing the workflow tools

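# Download the workflow definition
wget https://training.galaxyproject.org/training-material/topics/statistics/tutorials/text_mining_chinese/workflows/main_workflow.ga -O workflow.ga
# Extract the ToolShed tools the workflow needs into a tool list (Ephemeris)
workflow-to-tools -w workflow.ga -o tools.yaml
# Install those tools; replace GALAXY with the server URL and API_KEY with an admin API key
shed-tools install -g GALAXY -a API_KEY -t tools.yaml
# Import the workflow into the server and publish it
workflow-install -g GALAXY -a API_KEY -w workflow.ga --publish-workflows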
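
To make the tool-extraction step less opaque, here is a minimal sketch of the kind of information a .ga file carries. It assumes the workflow file is JSON with a "steps" mapping in which ToolShed-installed steps have a "tool_shed_repository" entry; the sample data, the function name and the output keys are illustrative, and the real workflow-to-tools output may contain more fields.

```python
import json

# Hypothetical, heavily trimmed stand-in for a downloaded workflow.ga file.
SAMPLE_WORKFLOW = json.loads("""
{
  "steps": {
    "0": {"tool_id": null},
    "2": {"tool_id": "tp_replace_in_line",
          "tool_shed_repository": {"name": "text_processing", "owner": "bgruening",
                                   "tool_shed": "toolshed.g2.bx.psu.edu"}},
    "3": {"tool_id": "tp_replace_in_line",
          "tool_shed_repository": {"name": "text_processing", "owner": "bgruening",
                                   "tool_shed": "toolshed.g2.bx.psu.edu"}},
    "8": {"tool_id": "Cut1"},
    "10": {"tool_id": "wordcloud",
           "tool_shed_repository": {"name": "wordcloud", "owner": "bgruening",
                                    "tool_shed": "toolshed.g2.bx.psu.edu"}}
  }
}
""")


def toolshed_tools(workflow: dict) -> list[dict]:
    """Collect the distinct ToolShed repositories referenced by a workflow."""
    found = {}
    for step in workflow.get("steps", {}).values():
        repo = step.get("tool_shed_repository")
        if not repo:
            # Input steps and built-in tools (e.g. Cut1) have no repository.
            continue
        key = (repo["tool_shed"], repo["owner"], repo["name"])
        found[key] = {
            "name": repo["name"],
            "owner": repo["owner"],
            "tool_shed_url": repo["tool_shed"],
        }
    return [found[key] for key in sorted(found)]


if __name__ == "__main__":
    for tool in toolshed_tools(SAMPLE_WORKFLOW):
        print(tool["tool_shed_url"], tool["owner"], tool["name"])
```

To try it on the real file, replace SAMPLE_WORKFLOW with json.load(open("workflow.ga")) after running the wget command.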