Galaxy dataset deletion
Huiming Ding (talk | contribs)
Latest revision as of 12:39, 29 July 2011
Data deletion
Managing library datasets is a bit complex, so here is a scenario that should clarify it. Most of the complexity of handling library datasets is contained in the delete_datasets() method of the cleanup_datasets.py script.
Assume we have one library dataset, consisting of a LibraryDatasetDatasetAssociation that maps a LibraryDataset to a Dataset. At this point, we have the following database column values:
    LibraryDatasetDatasetAssociation deleted: False
    LibraryDataset deleted: False, purged: False
    Dataset deleted: False, purged: False
1. A user deletes this dataset from a data library via a UI menu option. This action results in the following database column values (changes from the previous step are marked with *):
    LibraryDatasetDatasetAssociation deleted: False
    LibraryDataset deleted: True*, purged: False
    Dataset deleted: False, purged: False
2. After the number of days configured for the delete_datasets() method (option -6 below) has passed, running delete_datasets() results in the following database column values (changes from the previous step are marked with *):
    LibraryDatasetDatasetAssociation deleted: True*
    LibraryDataset deleted: True, purged: True*
    Dataset deleted: True*, purged: False
3. After the number of days configured for the purge_datasets() method (option -3 below) has passed, running purge_datasets() results in the following database column values (changes from the previous step are marked with *):
    LibraryDatasetDatasetAssociation deleted: True
    LibraryDataset deleted: True, purged: True
    Dataset deleted: True, purged: True* (dataset file removed from disk if the -r flag is used)
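The three steps above can be replayed as a small simulation. This is only a sketch of the flag transitions described in the scenario, not the code in cleanup_datasets.py: the dictionary layout, the `update_time` field used for the age check, and all function names are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def new_state(now):
    """Initial column values from the scenario, plus a hypothetical update_time."""
    return {
        "ldda": {"deleted": False},
        "library_dataset": {"deleted": False, "purged": False},
        "dataset": {"deleted": False, "purged": False},
        "update_time": now,  # assumption: age is measured from the last change
    }


def is_old_enough(state, now, days):
    """True if the last change happened more than `days` days before `now`."""
    return state["update_time"] < now - timedelta(days=days)


def user_deletes(state, now):
    """Step 1: the user deletes the library dataset through the UI."""
    state["library_dataset"]["deleted"] = True
    state["update_time"] = now


def simulate_delete_datasets(state, now, days):
    """Step 2: mimic the outcome described for delete_datasets()."""
    if state["library_dataset"]["deleted"] and is_old_enough(state, now, days):
        state["ldda"]["deleted"] = True
        state["library_dataset"]["purged"] = True
        state["dataset"]["deleted"] = True
        state["update_time"] = now


def simulate_purge_datasets(state, now, days, remove_from_disk=False):
    """Step 3: mimic the outcome described for purge_datasets().

    Returns True when the file would also be removed from disk (the -r case).
    """
    dataset = state["dataset"]
    if dataset["deleted"] and not dataset["purged"] and is_old_enough(state, now, days):
        dataset["purged"] = True
        state["update_time"] = now
        return remove_from_disk
    return False
```

Passing `now` explicitly, rather than reading the clock inside each function, keeps the sketch deterministic and makes it easy to check that nothing changes before the configured number of days has elapsed.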
This scenario is about as simple as it gets.
Keep in mind that a Dataset object can have many HistoryDatasetAssociations and many LibraryDatasetDatasetAssociations, and a LibraryDataset can have many LibraryDatasetDatasetAssociations.
Another way of stating it is: LibraryDatasetDatasetAssociation objects map LibraryDataset objects to Dataset objects, and Dataset objects may be mapped to History objects via HistoryDatasetAssociation objects.
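The relationships just described can be sketched with plain Python classes. The class names mirror the objects named above, but these are simplified stand-ins, not Galaxy's model classes; the attribute names and the `all_associations_deleted()` helper are hypothetical and only illustrate why a shared Dataset makes deletion less straightforward than in the single-association scenario.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Dataset:
    deleted: bool = False
    purged: bool = False
    # One Dataset may be referenced from many histories and many libraries.
    history_associations: List["HistoryDatasetAssociation"] = field(default_factory=list)
    library_associations: List["LibraryDatasetDatasetAssociation"] = field(default_factory=list)


@dataclass(eq=False)
class LibraryDataset:
    deleted: bool = False
    purged: bool = False
    # One LibraryDataset may have many associations.
    library_associations: List["LibraryDatasetDatasetAssociation"] = field(default_factory=list)


@dataclass(eq=False)
class HistoryDatasetAssociation:
    dataset: Dataset
    deleted: bool = False

    def __post_init__(self):
        # Register this association on the Dataset it points to.
        self.dataset.history_associations.append(self)


@dataclass(eq=False)
class LibraryDatasetDatasetAssociation:
    library_dataset: LibraryDataset
    dataset: Dataset
    deleted: bool = False

    def __post_init__(self):
        # Register this association on both ends of the mapping.
        self.library_dataset.library_associations.append(self)
        self.dataset.library_associations.append(self)


def all_associations_deleted(dataset):
    """Hypothetical check: is every association to this Dataset marked deleted?"""
    associations = dataset.history_associations + dataset.library_associations
    return all(assoc.deleted for assoc in associations)
```

For example, a Dataset that is referenced from both a library and a history still has a live association after its LibraryDatasetDatasetAssociation is marked deleted, so `all_associations_deleted()` returns False until the HistoryDatasetAssociation is marked deleted as well.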
References
- The source code: cleanup_datasets.py.
- Galaxy mailing list: http://www.mail-archive.com/galaxy-dev@lists.bx.psu.edu/msg01827.html