@CalvinFang-code
Collaborator

I've made the following improvements, which I hope will be useful:

  1. Added the _join_comparative_analyses function to _build_metadata_table to incorporate comparative datasets. _join_comparative_analyses queries the comparative dataset using SQL and prepares the results for matching; _parse_composite_identifier parses the IDs from the comparative dataset for matching.
  2. This worked with Harbison but failed with Hackett because of a case mismatch (uppercase H). I therefore added code to _join_comparative_analyses that tries both uppercase and lowercase initial letters for the repo ID.
  3. Added a query_dto function that specifically handles the intersection of specified binding and perturbation datasets.
  4. I also found some inconsistencies in identifier format between datasets.
     Some use semicolons as separators, others use slashes:
       BrentLab/harbison_2004;harbison_2004;3
       BrentLab/rossi_2021/rossi_2021_af_combined
     Some start the repo name with an uppercase letter, others with lowercase:
       BrentLab/Hackett_2020;hackett_2020;34
       BrentLab/harbison_2004;harbison_2004;3
     Do we need to unify them, or should we handle each format separately with functions?
  5. I could not read the calling cards data. The program crashed several times and I haven't found the cause yet, so I haven't continued with that analysis.
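
To make the question about the identifier formats concrete, here is a minimal sketch of what a single normalizing parser could look like. It is not the code in this PR: the name normalize_composite_identifier, the CompositeId type, and the field names are hypothetical, and the reading of the three parts as repo, config, and record ID is my assumption based only on the four example strings above. The lowercased result is meant as a matching key only, not as something to send back to the hub.

```python
from typing import NamedTuple, Optional


class CompositeId(NamedTuple):
    """Hypothetical normalized key; field meanings are assumptions."""
    repo_id: str               # e.g. "brentlab/harbison_2004"
    config_name: str           # e.g. "harbison_2004"
    record_id: Optional[str]   # e.g. "3"; None when the identifier has no third part


def normalize_composite_identifier(raw: str) -> CompositeId:
    """Parse either separator style and case-fold the names for matching."""
    text = raw.strip()
    if ";" in text:
        # Semicolon style, e.g. "BrentLab/harbison_2004;harbison_2004;3"
        parts = text.split(";")
        if len(parts) != 3:
            raise ValueError(f"expected 3 ';'-separated parts, got: {raw!r}")
        repo, config, record = parts
    else:
        # Slash style, e.g. "BrentLab/rossi_2021/rossi_2021_af_combined"
        parts = text.split("/")
        if len(parts) != 3:
            raise ValueError(f"expected 3 '/'-separated parts, got: {raw!r}")
        repo = "/".join(parts[:2])
        config = parts[2]
        record = None
    # Case-fold so "Hackett_2020" and "hackett_2020" compare equal.
    return CompositeId(repo.casefold(), config.casefold(), record)
```

With something like this, the uppercase/lowercase retry in _join_comparative_analyses could become a single comparison on the normalized key, at the cost of keeping the original string around wherever the exact repo ID is still needed.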

@CalvinFang-code
Collaborator Author

I find this strange; this problem didn't occur in my local testing. I'll investigate what's causing this later.
