DataONE:Notebook/Summer 2010/2010/07/01 chat

In the chat room: Heather Piwowar (hpiwowar@gmail.com), Nicholas Weber (nicholas.m.weber@gmail.com)
9:00 AM Heather: You've been invited to this chat room!
Valerie has joined
Heather: Good morning all!
Nicholas: Hi
9:01 AM Sarah: hello
Valerie has left
9:02 AM Heather: Whoops we lost Valerie
9:03 AM Valerie has joined
Valerie: hello
Heather: Hi Valerie. Hope you've all had a good week? The Evolution meeting was very interesting. Lots of learning and new contacts.
Valerie: yes, it finally cooled down here/I finally fixed the reboot error on my computer. neat
9:04 AM Nicholas: Heather how big was the conference? in terms of attendees I suppose
Heather: 1800 attendees for the main one
Nicholas: whoa
Valerie: wow
Heather: a hundred or two for the smaller iEvoBio, informatics-related
9:05 AM Heather: yeah. lots of parallel sessions. lots of diversity, it was great. any questions or thoughts before we dive in talking about knoxville?
9:06 AM Heather: (feel free to say no, otherwise not clear how long is a long enough pause via chat...)
Nicholas: I don't think so
Heather: valerie? sarah?
Valerie: do we need to have an article draft written by then or just the presentation?
9:07 AM Heather: Valerie I think in your case an article draft would be helpful. Or at least an article outline
Valerie: ok, cool
Heather: In part to direct where your analysis goes for the last few weeks.
9:08 AM Sarah: sorry for the slow response. I'm good. I'm treating my powerpoint outline as my manuscript outline.
Valerie: I'm sort of doing that as well
Heather: ok, good.
9:09 AM Sarah: the knoxville data analysis will be my preliminary/test analysis for the final data
Heather: well then two overall things before we dig into details on presentations....
Nicholas: me as well... I looked at the presentation as poster material that will get developed into an article
Heather: yup, sounds good guys. your presentations reflect that. more on that in a minute....
9:10 AM Heather: I just want to get your thoughts on a few other things first to make sure I don't forget. first: the proposed structure of the day has mentor presentations, then a break, then intern presentations back to back
9:11 AM Heather: I was thinking of suggesting that instead we have a fairly lengthy opportunity for discussion after each intern presentation. something like 30mins. what do you think?
9:12 AM Heather: opinions on discussion length, structure?
Valerie: sure (although I will admit that a 30 minute discussion about my project sounds a bit intimidating/daunting)
Heather: I appreciate that, Valerie :)
9:13 AM Heather: in part I think it is worth making sure we frame this as a working session
Sarah: i think 30 min total would be good....15-20 presentation, 10-15 questions
Heather: so the 30mins aren't critiques... they are truly discussion.
Valerie: oh, I thought you had meant 30 minutes for discussion. 30 minutes all together sounds fine
Nicholas: maybe we ramp it up, so that initially we spend some time talking about each presentation, but spend more time when they are finished looking for overlaps? just a suggestion
Heather: I did actually think 30 mins for discussion, you interpreted me correctly
9:14 AM Heather: yup, nic.
Valerie: that makes sense
Heather: ok, thanks guys. so maybe 15-20mins after each, then more later.
Sarah: yeah, maybe a group "think-tank" after we observed all the presentations would be good
Valerie: agreed
Sarah: with shorter q and a time after each presentation
9:15 AM Nicholas: yes
Heather: btw I don't know if you ever get a chance to be in a doctoral consortium where there is 30mins of group discussion on your project after your talk.... if so, go for it. daunting, but very helpful. sarah, gotcha. I think we'll suggest a bit of both then.
Nicholas: are those consortiums fellow doctoral candidates?
9:16 AM Heather: yes, from different universities.
Heather: conferences often offer them
Nicholas: I often see them, but I wasn't sure of the structure
Heather: you get feedback from a variety of faculty from other unis + the other students. structure is of course different in different conferences.
9:17 AM Heather: anyway, worth applying, very useful.
Valerie: good to know
Heather: ok. another thought. what do you think of a 30 minute segment during the 2 days in which we talk about OWW
9:18 AM Heather: perhaps each of the three of you + me + maybe someone else present three slides: 1. How I used OWW, 2. what I liked, 3. what I didn't like
Valerie: sure
Sarah: sure. is the other group using it? have the mentors really embraced it (I'm feeling like not so much)
9:19 AM Heather: nope. but at the NESCent session in the Evolution meeting, Todd touted your use of it as one of the "open science" initiatives going on
Nicholas: ha
Heather: associated with NESCent. ! :)
9:20 AM Heather: so clearly it has value, and I think it is worth taking the opportunity to figure out what worked, what didn't
Nicholas: is it being used in the wider dataOne project at all?
Heather: how we/future interns/future scientists of this type might be able to use it better, etc. Not yet, outside of my own research, which I'm slowly starting to put up and do openly there.
9:21 AM Heather: sound reasonable? or does anyone have a different suggestion on how to structure an OWW take-away session?
Valerie: maybe a panel discussion as opposed to three separate presentations?
9:22 AM Sarah: my only idea is that maybe we could walk through the online system, rather than having slides
Valerie: yeah
Heather: cool
Sarah: a panel would be a good addition to that
Nicholas: I'd like to hear how others are experiencing it so I think this sounds good
Heather: so each of the three of you could perhaps click through a bit of your notebooks/pages while explaining how you did things, what you liked, what you didn't
9:23 AM Sarah: yeah, and more casual so people can pipe up when they have a question
Heather: then at the end we have an open discussion about ways to do it better, suggestions we might have for OWW, etc. perfect.
Valerie: that sounds great
Heather: I like the panel idea, though it makes me think given the nature of this meeting maybe we take it one level less organized than that and have it be a circle discussion or something.
9:24 AM Heather: ok. what do you think, 5-10 mins each for an OWW recap
Valerie: sure
Nicholas: that sounds good
Heather: with room for discussion at the end, so maybe 45 mins total?
9:25 AM Valerie: sounds fine
Sarah: good by me
Heather: sarah, sounds right to you? ok, good. Anything else before we dig into presentations? I think the main structure of the meeting is just mentor presentations
9:26 AM Heather: then intern presentations, then group discussion, then breakout sessions. so let me know if there is any other structured discussion you'd like to have
9:27 AM Heather: Presentations. Nice work on the content of what you've been doing.
9:28 AM Heather: One thing that struck me about all three is that you've really focused on the methods and interim results. This makes sense, it is what presentations usually consist of. In this case, though, we really are looking for more emphasis on motivation and background, as well as lessons learned and areas for discussion
9:29 AM Heather: as well as middle content. so really 1/3, 1/3, 1/3. let me tell you why: not everyone in the room will have been thinking about these projects, so they'll appreciate the motivation.
9:30 AM Heather: even more, we want to hear what You think the motivation is. As you write it up, you'll need to describe the background, rationale, etc and so this is a great idea to get feedback on your understanding of it. then, at the end, we want lots of emphasis on how the project went, what questions you have
9:31 AM Heather: since this is as much a learning experience as a research-output experience
9:33 AM Nicholas: when you say questions we have... you mean questions for the mentors with respect to what we've gathered? or questions in a larger sense about what our gathered data means related to each other?
Heather: oops, I'm disconnected, do you get this?
9:34 AM Heather: you there?
Nicholas: yes
Valerie: yes
Heather: sorry about that
Nicholas: I don't think I received your reply
Heather: I'm in a hotel lobby....
Sarah: yeah, i'm good
Heather: here was the last bit: this may mean that you need to streamline the methods and results slides you have right now.... keep the details in the back of your slidedeck in case we get there in questions or break-out sessions.... oops, I'm disconnected, do you get this? we want to hear how it went, what you've learned. so, what do you think, does that make sense?
Valerie: that does make sense
9:35 AM Nicholas: makes sense
Sarah: i was planning on discussing my rationale as I went through the methods....is that part of what we're getting at? not just, "I did this", but "i decided to do this b/c"
9:36 AM Heather: some. but we're also looking for the motivation for doing the project.
Heather: Now I know you guys didn't pick the projects so in some ways it isn't your motivation, but it will be helpful to hear you say why you think doing this project is important
9:37 AM Heather: your view of how it relates to other work, how you envision your results being used, etc. In no way intended to be a test of your understanding of the related literature. that's not what it is about. but tell us what you think, and then the mentors can help you fill in the gaps
9:38 AM Heather: with what they think :)
Valerie: ah, yeah. I had been wondering if I had to do a prior review of data citation literature prior to writing the article/presentation. this makes more sense now
9:39 AM Heather: And feel free to say "so I don't know why this bit matters" or whatever. It is intended to be that sort of presentation, rather than a polished finished product. sarah, nic, can you see that working for you?
9:40 AM Nicholas: yes... I think I've got some retooling to do but I think this makes sense
Valerie: same here
9:41 AM Sarah: yeah, I'm adding some placeholder slides as we speak
Heather: great. so aim for 1/3, 1/3, 1/3. cool. For the last third, one of the things that didn't make it into the early slide decks was a discussion of limitations. important.
9:42 AM Heather: Also, I think one or two of you mentioned future work, but not all.... that is useful too
Sarah: meaning problems we've encountered?
Sarah: i've added that and future work last night
Heather: yeah, so problems you encountered, for sure
Valerie: I put in a bit about stumbling and figured I'd elaborate in the actual presentation
Heather: we definitely want that
Sarah: since this is less formal, I think i'll also add a slide about questions this has drummed up but that i won't be able to address in the scope of this project
Heather: yes, valerie, your stumbling stuff was great
9:43 AM Heather: yes sarah, super. I also mean limitations in the traditional paper sense. so any method of doing research has limitations: in generalizability, if you only looked at a few journals
9:44 AM Nicholas: what we didn't look at, why we disregarded it...right?
Valerie: ah
Heather: or in formal assessment, in valerie's case, since she didn't actually have a "gold standard" set of ideal results that she was comparing her search results to. yes. or nic, in your case, it could be
9:45 AM Heather: that you are only looking at the journal's written policies but you don't know how actively they enforce them. or you weren't looking at their instructions to reviewers. or ??? basically, if you were to critique your own study design, what would you say are its limitations?
9:46 AM Heather: (doesn't mean it is a bad study, all studies have limitations) a classic limitation is that association does not imply causation, for example. does that help?
Nicholas: yes
Valerie: yes
Sarah: yep
9:47 AM Heather: good. ok, have you guys had a chance to look at each other's slides?
Valerie: a bit
Sarah: briefly
9:48 AM Nicholas: yes
Heather: how about this then, rather than making this chat take hours
Valerie: I like the Prezi layout, neat use of an infographic
Nicholas: it's really easy to learn and fun... I hope it's ok I'm using this format heather?
Heather: (no kidding. Nic, that is neat! I hadn't seen that before) absolutely.
9:49 AM Nicholas: ok cool
Heather: go nuts.
Heather: ok, so why don't you have a look at each other's and actively make some comments on OWW etc
9:50 AM Sarah: will do
Heather: I'll do that too
Valerie: ok
Heather: I did make a few specific notes I think that I'll add here right now, but otherwise I'll add other comments on OWW talk pages too.
9:51 AM Heather: Here's what I had, in addition to comments already made:
for all three:
- Need the why
- why this project
- who is the audience
- what will it help them do
- what is previous work in this area (might not know some of this. that is ok. advisors will help)
for the last third:
- what format are you thinking for final output? where to submit?
- anticipated schedule: what to do in the next few weeks?
- limitations
- things that didn't work well, were difficult
- things you are worried about
9:52 AM Heather: especially worth highlighting the "anticipated schedule" part. we only have really two weeks after Knoxville, right?
Valerie: right
Heather: Expectation isn't that you'll have finished papers done by then, but expectation is that the data collection is done and analysis is ideally done too
9:53 AM Heather: so that places some tight restrictions on what the future-work can be, within the scope of the project. so, using these presentations as working and communication documents, I think it would be in your best interests to make it really clear what you plan to try to accomplish in those two weeks
9:54 AM Heather: so that others can give you feedback on that. ok, other notes:
Nicholas: just an fyi for Valerie and Sarah, July 23rd is final due date for IDCC http://www.dcc.ac.uk/events/conferences/6th-international-digital-curation-conference/papers... I'm trying to use that for my "due date"
Heather: great.
9:55 AM Valerie: good to know
Heather: There is also an ASIS&T poster deadline on July 16th (I think) that could be a deadline for people who were planning to attend that.
Heather: In Pittsburgh in Oct this year. I'd explicitly include your publication thoughts in your presentations so that you can get feedback on those
9:56 AM Nicholas: great
Heather: Don't be afraid to be wrong or naive or ambitious or whatever. You are supposed to be learning, so put it all out there.
Valerie: ok
Heather: nic, a few points on your presentation content:
9:57 AM Heather: "better assess their influence" but why would we want to do that?
- inform new policy
- know what changes to lobby for in current policy
- evidence to help convince people
- know patterns so can look at other patterns
- other reasons ???
9:58 AM Heather: In general, I think your audience will find it easier to quickly absorb your metadata fields and extracted variables if they are in lists rather than paragraph form, even though that means it may not all fit on one slide as nicely. remember your audience is new to this stuff, so make it easily absorbable for them
Nicholas: ok
Heather: For repositories, are dois accession numbers?
9:59 AM Heather: curious to know how many offer dois... Also, I think on the funders page you list some numbers without having the denominator on the same page. so for example if you are saying 3 is like this and 8 is like that, make sure the total number is also on the page
10:00 AM Heather: or maybe provide a percentage for some of the larger categories
Nicholas: is it more effective to use % ?
Heather: yeah it depends, but maybe yes.
Nicholas: I wasn't sure, because the sample size is relatively small
Heather: change them to %, and make sure that the total number is also on the same page. ideally you'd have confidence intervals, but not clear you want to go there for this presentation
10:01 AM Heather: also, if you have questions about your planned analysis, then mention that. talk about your planned final output. is it just descriptive? is there a hypothesis (maybe not?) ?
10:02 AM Heather: include some "stumbles" to the extent that you had some. it may be effective to extract a few of the policy sentences as examples, though I know you'll be pushing it with time. that's what I had.
Nicholas: time is 25min or 20 min?
10:03 AM Heather: 20 mins I think
Nicholas: ok
Heather: though if you've got other slides, keep them in the back and we can go through them in breakout sessions if it is helpful. any questions on that feedback, or thoughts about why it wouldn't work, or ?
10:04 AM Nicholas: no no that's great
Heather: ok! Valerie....
Nicholas: I've just been feeling a bit in the last few days, like I've analyzed a lot of policies and not found a lot of explicit direction
Heather: explicit direction in what way?
Nicholas: so it's a bit hard to say... well I didn't find much
Heather: ok.
10:05 AM Heather: so in general not a lot of relevant policies, is that right?
Nicholas: I mean direction to authors, grantees, depositors. yes
Heather: gotcha. and this bothers you? does it bother you because you think you've missed something? or ?
10:06 AM Nicholas: well, I'm just wondering if I took the best approach, in hindsight
Heather: ok.
Nicholas: maybe this is something that comes out of our discussion and we can assess it then
Heather: great, well rather than trying to solve that for you right now, yes, exactly, bring it up.
10:07 AM Heather: that would work?
Nicholas: yup
Heather: ok. good.
Nicholas: (sorry for the delay Valerie)
Valerie: it's cool
10:08 AM Heather: Valerie, I love your stumbles section. definitely pull it into the slides. I think it conveys the results even more than the tables you have in there right now
Valerie: ok, there are definitely more things I could add now that I think about it. really?
Heather: In some ways, for me, the comparisons between repositories are where the interesting stuff comes out
10:09 AM Heather: and some of that is lost when the repositories are on different slides as it is now
Valerie: yeah, I tried at first to put all three on one slide
Heather: so you could consider reformatting to make the point clear, yeah, I bet it could be hard
Valerie: which resulted in 12 pt font. I remember someone telling me to use no smaller than 20 or 18 for ppt
Heather: yeah. you could try to lose a bit of detail then.
10:10 AM Heather: or just have + ++ +++ for very good, not so good, etc. or even better, flesh out a bit what you mean by good. it isn't clear: is it that the results didn't return any hits
Valerie: ok, that was something I meant to go into more detail about
Heather: or they returned hits that were swamped with unrelated hits, etc
Valerie: ok
10:11 AM Heather: maybe you could imagine a key to make sort of a summary chart to explain what returned a lot of things, but useless, versus hardly any things but useful, etc. there are ways to plot that on a precision/recall curve, but won't go into that right now. plus you don't have a formal gold standard to compare against.... so just do what makes sense for now
10:12 AM Valerie: ok, that's good to know
Heather: then I think I wrote this down before I read your stumbles section, but fyi.... what were some of the themes of the difficulties? how do these compare with what you expected? synonyms? lack of unique identifiers? what did you think they might be? what were they? and in general, as someone who is trying to find data reuses, what has been difficult? what tools would have made this easier? what policies or practices by authors etc would have made it easier?
10:13 AM Heather: those sort of take-aways would have a lot of value I think. not sure if my comments are clear, so ask if not.
Heather: that's all I had
10:14 AM Valerie: ok, those are valid and reflect things I wasn't sure if I had detailed enough. I'll revise the presentation based on your comments and add more on motivation/where to go from here in the coming weeks.
Heather: yeah, your call, I'm not sure all of the things I raised can be covered within your time, so take it in the spirit intended. great.
10:15 AM Heather: Sarah, just had a few comments on yours. more motivation, discussion of limitations, etc as above
10:16 AM Sarah: yeah, i've added space for that now
Heather: Add a slide on "the unexpected" or something, articulating that it took longer than you might have guessed to standardize on fields, or coordinate data extraction, or sync with OWW, or whatever is true. great. Also, esp important for you because it is so data collection heavy... add a forecast for the next few weeks
10:17 AM Heather: How long is it taking you to do the extraction? Are you getting faster? Do you think a bio background is necessary? (basically lessons-learned on doing this sort of extraction) then how does this translate into data collection that you have planned for the next few days or weeks
10:18 AM Heather: and what analysis you're aiming to have done
Sarah: ok. for now i'm giving a conservative estimate of having one more journal done (total of three) and comparative/correlative analyses
Heather: perfect
10:19 AM Heather: so flesh out future work, too. if someone (you? someone else?) were to try to collect a bigger set in the future, what would you recommend? I know you've already given that thought. And then the part we haven't figured out at all yet is how we are going to sync up data across projects
10:20 AM Heather: Not sure how that is going to work, given our time constraints. So I think mostly we highlight that as a plan, with actual feasibility unknown, and then
10:21 AM Heather: talk about it in our discussion sections. what do you guys think?
Sarah: i'm using nic's stuff on a regular basis and have some ideas for syncing it...but that's on a limited basis since I'm only looking at 3 journals that usually only utilize 3 depositories
Heather: ok good to know
Nicholas: right, and I've collected some data especially for Sarah's journals
Heather: great
Nicholas: w/r/t funding sources
10:22 AM Nicholas: so it would be nice to have time to think about how we can bring that together
Heather: yes. so maybe all three of you, highlight in your presentations where you have some overlap
Valerie: ok
Heather: in data use, data collection columns, etc (I think you are doing this already)
10:23 AM Heather: and then we'll purposefully talk about integration within our data citation breakout sessions. future integration, I mean... ok. that's all I had.
10:24 AM Heather: When do you guys get in, will you be there for dinner on the 6th?
Nicholas: I think I get in at 8 on the 6th
Valerie: my flight gets in around 8. yeah
10:25 AM Valerie: a late dinner?
Heather: ok. Not so much dinner then.
Sarah: i'm about the same i think
Heather: ok. In that case we'll probably just see each other in the morning
10:26 AM Heather: I think the other group is getting together to go over slides in the morning but that'll be pretty early for the west-coasters amongst us
10:27 AM Heather: so I think the schedule (to be circulated soon) has us meeting at about 8:30. any other thoughts or feedback?
10:28 AM Nicholas: are we meeting at the hotel?
Heather: good question, I'm not sure, but I'll get it cleared up
Sarah: also, what's the schedule for the 8th? and are most people staying until the 9th?
Valerie: I'm staying until the 9th
Heather: I'll be staying till the 9th
Nicholas: I fly out the morning of the 9th
Sarah: ok. good.
10:29 AM Heather: great. dinner on the 8th then!
Sarah: sounds good
Heather: (my bday, actually, so glad to have dinner companions ;) )
Nicholas: nice
Heather: should we schedule a chat for early next week?
Valerie: neat
10:30 AM Valerie: sure
Heather: thinking not, given the holiday?
Valerie: oh yeah
Heather: (though I'm in Canada, so for me the holiday is actually today, so doesn't matter)
Valerie: ah, people were tweeting about Canada day
10:31 AM Heather: ok, so no scheduled chat? but ping me/each other if want to chat
Valerie: ok, definitely
Heather: otherwise we'll just comment on slides via OWW
10:32 AM Valerie: sounds like a plan
Heather: ok! Sarah you'll put this chat up? thanks guys
Nicholas: bye, thanks
Nicholas has left
10:33 AM Sarah: will do. bye!
Valerie: thanks for the feedback. talk to you later
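
Follow-up note on the 10:00 AM advice to Nicholas (report percentages, keep the total on the same slide, and ideally give confidence intervals): a minimal sketch of how that could look in Python. The Wilson score interval is one common choice for small samples; it is an assumption here, not something the chat specifies. The function names, the category labels and the counts (3 and 8 out of a total of 20) are hypothetical and are not taken from anyone's data.

```python
import math


def wilson_interval(count, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = count / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to guard against tiny floating-point overshoot.
    return max(0.0, centre - half), min(1.0, centre + half)


def describe(label, count, total):
    """One slide-ready line: count, denominator, percentage and interval."""
    low, high = wilson_interval(count, total)
    pct = 100 * count / total
    return (f"{label}: {count} of {total} ({pct:.0f}%, "
            f"95% CI {100 * low:.0f}%-{100 * high:.0f}%)")


if __name__ == "__main__":
    total_funders = 20  # hypothetical denominator
    # Hypothetical categories and counts, for illustration only.
    for label, count in [("policy type A", 3), ("policy type B", 8)]:
        print(describe(label, count, total_funders))
```

With a sample this small the intervals come out wide, which is a reason to show the raw count and the total next to any percentage.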