Kubke Lab:Research/ABR/Notebook/2013/11/06

From OpenWetWare
Project: Hearing development in barn owls

General Entries

  • Insert content here...

Personal Entries

Fabiana

  • From file 2013-11-06-MFK.Rmd in Sandbox

Trying to organise file management

Need to:

  • open the folder, grab the list of files
  • separate log files from text files
  • put them in a data.frame with one column having txt files and another having log files
  • make sure that the names of txt and log files actually match, and put NA where one of the file pairs is missing
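
A minimal sketch of the steps listed above, assuming that paired files share everything before the final extension. The file names in `files` are a made-up stand-in for what `dir()` would return, and dropping `WS_FTP.LOG` is an assumption about what counts as data.

```r
# Hypothetical listing; in practice this would come from dir()
files <- c("222L01.LOG", "222L01.TXT", "222L02.LOG", "222L03.TXT", "WS_FTP.LOG")

# Assumption: the FTP client's own log is not a data file, so drop it
files <- files[toupper(files) != "WS_FTP.LOG"]

# Separate log files from txt files, ignoring case
is_log <- grepl("\\.log$", files, ignore.case = TRUE)
is_txt <- grepl("\\.txt$", files, ignore.case = TRUE)

logs <- data.frame(id = tools::file_path_sans_ext(files[is_log]),
                   log = files[is_log], stringsAsFactors = FALSE)
txts <- data.frame(id = tools::file_path_sans_ext(files[is_txt]),
                   txt = files[is_txt], stringsAsFactors = FALSE)

# Full outer join on the shared stem: a file without a partner
# gets NA in the other column
pairs <- merge(txts, logs, by = "id", all = TRUE)
print(pairs)
```

Using `merge(..., all = TRUE)` keeps files that lack a partner on either side, so both missing txt files and missing log files show up as NA.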


<html> <body> <h1>Trying to organise file management</h1>

<p>1) Change directory to the case number<br> 2) Get the directory list<br> 3) Subset the log files into one column<br> 4) Subset the txt files into a second column<br> </p>

<p>Structure of files is different for different case numbers:</p>

<p>Owl189 -&gt; 189###.LOG, 189###.TXT (99 objects: 98 files + WS_FTP.log)<br>
Owl222 -&gt; 222###.LOG, 222###.TXT (31 objects: 30 files + WS_FTP.log)<br>
Owl224 -&gt; 224###.LOG, 224###.TXT (39 objects: 38 files + WS_FTP.log)<br>
Owl229 -&gt; 229###.LOG, 229###.TXT (34 objects: 33 files + WS_FTP.log)<br>
Owl230 -&gt; 230###.LOG, 230###.txt (333 objects, ????)<br>
Owl233 -&gt; 233###.ABR.log, 233###.ABR.txt (311 objects)<br>
Owl335 -&gt; 335###.ABR.log, 335###.ABR.txt (172 objects)<br>
Owl336 -&gt; 336###.abr.log, 336###.abr.txt (20 objects)<br>
Owl416 -&gt; 416###.abr.log, 416###.abr.txt (358 objects)<br>
Owl419 -&gt; 419###.abr.log, 419###.abr.txt (396 objects)<br></p>
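
<p>Because the naming scheme differs between cases (upper or lower case, with or without an <code>.ABR</code> infix), one way to compare names is to reduce each one to a common recording ID first. The sketch below is an assumption about how that could be done; <code>recording_id</code> is a hypothetical helper and the example file names are invented to follow the patterns listed above.</p>

```r
# Hypothetical helper: strip an optional ".abr" and the final ".log" or ".txt",
# ignoring case, so every naming scheme reduces to the same recording ID
recording_id <- function(x) {
  sub("(\\.abr)?\\.(log|txt)$", "", x, ignore.case = TRUE)
}

# Invented names, one per naming pattern
examples <- c("222L01.LOG", "230R05.txt", "233L02.ABR.log", "416R11.abr.txt")
ids <- recording_id(examples)
print(ids)
```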

<p>Missing txt files in:<br>
Owl 189 (49 log, 49 txt)<br>
Owl 222 (15 log, 15 txt, one txt too small)<br>
Owl 224 (19 log, 19 txt, one txt too small)<br>
Owl 229 (18 log, 15 txt)<br>
Owl 230 (several 0 KB files and several small txt files) Folder 230(check has 206 files?)<br>
Owl 233 (several 0 KB files and small txt files) (folder 233(P) only has 64 objects)<br>
Owl 335 (several 0 KB files and small txt files)<br>
Owl 336 (10 log, 10 txt)<br>
Owl 416 (several empty files)<br>
Owl 419 (several empty files)<br></p>
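
<p>Empty and suspiciously small txt files could be flagged from their sizes rather than by eye. This is only a sketch: it builds two throwaway files in a temporary folder so that it runs anywhere, and the <code>min_bytes</code> threshold is an assumed value, not one taken from the data.</p>

```r
# Throwaway folder with one empty and one non-empty file (illustration only)
demo_dir <- file.path(tempdir(), "owl_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "222L01.TXT"))                    # 0-byte file
writeLines(rep("0.123", 50), file.path(demo_dir, "222L02.TXT"))   # file with content

# List the txt files (any case) and read their sizes in bytes
txt_paths <- list.files(demo_dir, pattern = "\\.txt$",
                        ignore.case = TRUE, full.names = TRUE)
sizes <- file.info(txt_paths)$size

min_bytes <- 100  # assumed cut-off for "too small"
report <- data.frame(file = basename(txt_paths), bytes = sizes,
                     empty = sizes == 0, small = sizes < min_bytes,
                     stringsAsFactors = FALSE)
print(report)
```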

<p>testing with Owl222</p>

<pre><code class="r"># basedir &lt;- getwd()
# enter case to analyse:
# newdir &lt;- readline(&#39;enter case number: &#39;)
# create dir name as basedir\Analysis\datafiles\case:
# basedir &lt;- getwd()
# casedir &lt;- paste(basedir, newdir, sep = &#39;/&#39;)
# setwd(casedir)
</code></pre>

<p>or</p>

<pre><code class="r"># dir.create(file.path(basedir, casedir), showWarnings = FALSE)
# setwd(file.path(basedir, casedir))
</code></pre>

<p>1) Need to get the list of files<br> 2) Need to separate the files, with [whatever].log in one column and [whatever].txt in another. <br> 3) Somehow I need to know when the [whatever] parts on the same row do not match. <br></p>
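
<p>For step 3, one possible check is to strip the extension from both columns and compare the stems row by row. The table below is invented for illustration (row 2 is deliberately misaligned), and <code>stem</code> is a hypothetical helper.</p>

```r
# Hypothetical two-column table in which row 2 is misaligned and row 3 lacks a txt
casefiles <- data.frame(
  log = c("222L01.LOG", "222L02.LOG", "222L03.LOG"),
  txt = c("222L01.TXT", "222L03.TXT", NA),
  stringsAsFactors = FALSE
)

# Strip the final extension so only the shared stem is compared
stem <- function(x) sub("\\.[^.]*$", "", x)

# TRUE where both names exist and share a stem; FALSE otherwise (including NA)
casefiles$matched <- !is.na(casefiles$log) & !is.na(casefiles$txt) &
  stem(casefiles$log) == stem(casefiles$txt)
print(casefiles)
```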

<p>Back to testing on folder Owl222<br> Owl222 -&gt; 222###.LOG, 222###.TXT (31 objects: 30 files + WS_FTP.log)</p>

<pre><code class="r">setwd(&quot;~/Dropbox/OrisABR/Analysis/datafiles/OWL222&quot;)
files &lt;- dir()
head(files)
</code></pre>

<pre><code>## [1] &quot;222L01.LOG&quot; &quot;222L01.TXT&quot; &quot;222L02.LOG&quot; &quot;222L02.TXT&quot; &quot;222L03.LOG&quot;
## [6] &quot;222L03.TXT&quot;
</code></pre>

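<pre><code class="r"># need to separate the txt files from the log files
# anchored pattern: the whole name must end in .log (any case);
# inside [ ] a | is a literal character, so it is left out
log &lt;- regexpr(&quot;^.*\\.[Ll][Oo][Gg]$&quot;, files)
logfiles &lt;- regmatches(files, log)
length(logfiles)
</code></pre>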

<pre><code>## [1] 16 </code></pre>

<pre><code class="r">print(logfiles) </code></pre>

<pre><code>##  [1] &quot;222L01.LOG&quot; &quot;222L02.LOG&quot; &quot;222L03.LOG&quot; &quot;222L04.LOG&quot; &quot;222L05.LOG&quot;
##  [6] &quot;222L06.LOG&quot; &quot;222L07.LOG&quot; &quot;222L08.LOG&quot; &quot;222L09.LOG&quot; &quot;222L0A.LOG&quot;
## [11] &quot;222L0B.LOG&quot; &quot;222L0C.LOG&quot; &quot;222L0D.LOG&quot; &quot;222L0E.LOG&quot; &quot;222R18.LOG&quot;
## [16] &quot;WS_FTP.LOG&quot;
</code></pre>

<pre><code class="r"># same approach for the txt files: the name must end in .txt (any case)
txt &lt;- regexpr(&quot;^.*\\.[Tt][Xx][Tt]$&quot;, files)
txtfiles &lt;- regmatches(files, txt)
length(txtfiles)
</code></pre>

<pre><code>## [1] 15 </code></pre>

<pre><code class="r">print(txtfiles) </code></pre>

<pre><code>##  [1] &quot;222L01.TXT&quot; &quot;222L02.TXT&quot; &quot;222L03.TXT&quot; &quot;222L04.TXT&quot; &quot;222L05.TXT&quot;
##  [6] &quot;222L06.TXT&quot; &quot;222L07.TXT&quot; &quot;222L08.TXT&quot; &quot;222L09.TXT&quot; &quot;222L0A.TXT&quot;
## [11] &quot;222L0B.TXT&quot; &quot;222L0C.TXT&quot; &quot;222L0D.TXT&quot; &quot;222L0E.TXT&quot; &quot;222R18.TXT&quot;
</code></pre>

<p>Now I need to put those into a single data frame, making sure that the file names are matched for log and txt. So I am trying to compare the first 6 characters of each.</p>

<p>I can assume that a log file without a txt file is irrelevant. So I can step through txtfiles one by one, look for the matching entry in logfiles, and write the pair to a data frame where column 1 holds the txt files and column 2 the log files. If a log file is missing, I can put NA in its place.</p>
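
<p>A vectorised sketch of that idea, assuming the first 6 characters identify a recording: <code>match()</code> returns, for each txt file, the position of its log file, or NA if there is none. The short listings below are invented so that one trace lacks a header.</p>

```r
# Hypothetical listings: 222L02 has no log file, and WS_FTP.LOG has no txt file
txtfiles <- c("222L01.TXT", "222L02.TXT", "222R18.TXT")
logfiles <- c("222L01.LOG", "222R18.LOG", "WS_FTP.LOG")

# Compare on the first 6 characters only
txt_id <- substr(txtfiles, 1, 6)
log_id <- substr(logfiles, 1, 6)

# For each trace, look up the position of its header; indexing with NA yields NA
casefiles <- data.frame(traces  = txtfiles,
                        headers = logfiles[match(txt_id, log_id)],
                        stringsAsFactors = FALSE)
print(casefiles)
```

<p>Because the lookup is driven by the txt files, log files without a trace (such as WS_FTP.LOG here) simply never appear in the result.</p>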

<pre><code class="r">n &lt;- length(txtfiles)
i = 1
traces &lt;- txtfiles[1:n]
# taking the first n log files drops WS_FTP.LOG only because it sorts last
headers &lt;- logfiles[1:n]
casefiles &lt;- data.frame(traces, headers)

# while(i&lt;n+1){ get traces[i]
# extract first any 6 characters at beginning of string in txt files
# (traces): &#39;^.{6}&#39;

test &lt;- regexpr(&quot;^.{6}&quot;, traces)
test2 &lt;- regmatches(traces, test)
print(test2)
</code></pre>

<pre><code>##  [1] &quot;222L01&quot; &quot;222L02&quot; &quot;222L03&quot; &quot;222L04&quot; &quot;222L05&quot; &quot;222L06&quot; &quot;222L07&quot;
##  [8] &quot;222L08&quot; &quot;222L09&quot; &quot;222L0A&quot; &quot;222L0B&quot; &quot;222L0C&quot; &quot;222L0D&quot; &quot;222L0E&quot;
## [15] &quot;222R18&quot;
</code></pre>

<pre><code class="r"># look for a match in logfiles (headers)
test3 &lt;- regexpr(&quot;^.{6}&quot;, headers)
test4 &lt;- regmatches(headers, test3)
print(test4)
</code></pre>

<pre><code>##  [1] &quot;222L01&quot; &quot;222L02&quot; &quot;222L03&quot; &quot;222L04&quot; &quot;222L05&quot; &quot;222L06&quot; &quot;222L07&quot;
##  [8] &quot;222L08&quot; &quot;222L09&quot; &quot;222L0A&quot; &quot;222L0B&quot; &quot;222L0C&quot; &quot;222L0D&quot; &quot;222L0E&quot;
## [15] &quot;222R18&quot;
</code></pre>

<pre><code class="r"># i=i+1 }
</code></pre>

<p>The plan: grab the first element of test2 and move down through test4 until I find a match. When I do, write the pair into casefiles$traces and casefiles$headers. Because test2 and test4 hold only the 6-character stems, the full file names have to come from traces and headers instead (using i and j for location). Perhaps I can run regexpr and regmatches on each individual element rather than creating a new vector? Then write txtfiles[i] and its matching log file into casefiles$txt and casefiles$log.</p>
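
<p>Spelled out as an explicit loop, the search described above might look like the sketch below. It assumes the first 6 characters identify a recording, computes the stem for one element at a time, and uses the index <code>j</code> to recover the full log file name. The listings are invented for illustration.</p>

```r
# Hypothetical listings; 222L0B has no log file
txtfiles <- c("222L0A.TXT", "222L0B.TXT", "222L0C.TXT")
logfiles <- c("222L0A.LOG", "222L0C.LOG", "WS_FTP.LOG")

# Start with every header missing, then fill in the ones that are found
casefiles <- data.frame(traces = txtfiles, headers = NA_character_,
                        stringsAsFactors = FALSE)

for (i in seq_along(txtfiles)) {
  key <- substr(txtfiles[i], 1, 6)            # stem of this txt file only
  j <- which(substr(logfiles, 1, 6) == key)   # positions of matching log files
  if (length(j) > 0) {
    casefiles$headers[i] <- logfiles[j[1]]    # full name, taken via the index
  }
}
print(casefiles)
```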

</body>

</html>

Andy

  • Enter content here

Oris

  • Enter content here