Elizabeth Murray
  • home
  • research
    • phylogenomics in Aculeata
    • bee viruses
    • eucharitid ant parasitoids
  • publications
  • teaching
  • blog

Renaming multiple files using PowerShell

4/23/2017

For the PC owners: Have you ever used Windows PowerShell? It's something akin to Command Prompt, and can be utilized for task automation. It provides a really easy way to batch rename hundreds of files in a folder.
Specifically, I had hundreds of individual file names containing a "-" that needed to be replaced or removed, and I was looking for something straightforward for changing the names. I found Windows PowerShell, which is a management framework that was developed about ten years ago. I discovered it was pretty easy to use PowerShell for this task and others like it, and thought I'd share it here!

Here's how to get set up:
All your files should be in one folder, which you'll designate as your working directory. Open PowerShell on your PC. You can just type "powershell" into the search box in Windows 10. A window will open.
To navigate to your folder of files, type "cd" (change directory) at the prompt and then add a space. Drag and drop your working folder into the PowerShell window. The drag and drop puts the whole file path into place for you. Press enter and you'll see that you're now operating out of your folder. 
[Picture: My files are all in a folder called "files_to_rename". This folder is now set as my working directory.]

to do a batch rename on the files in your current directory:
[Pictures: original filenames; modified filenames; alternate renaming]
There is just one line of code that you paste into PowerShell.
dir | rename-item -NewName {$_.name -replace "-","_"}
Here, all you have to modify is what is found in the quotes. The first set of quotes contains the piece of the file name to replace (a dash), and the second set contains the string you want instead (an underscore). Alternatively, if you do not want any character there, leave the second set of quotes empty, with no space (as in: -replace "-",""). Files whose names don't contain the first string are left unchanged.
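One caveat when editing the quoted strings: PowerShell's -replace operator treats the first string as a regular expression. A dash is safe as-is, but characters such as "." have a special meaning. Here is a minimal sketch on a plain string, so nothing on disk is touched (the file name is made up for illustration):

```powershell
# Hypothetical file name, used only to show what -replace returns.
$name = "sample.v2-final.txt"

# A dash has no special meaning in a regular expression, so it is matched literally.
$dashToUnderscore = $name -replace "-", "_"

# An empty second string deletes the match instead of replacing it.
$dashRemoved = $name -replace "-", ""

# A bare "." is a regex wildcard that matches every character...
$wildcard = $name -replace ".", "_"

# ...so escape it (by hand as "\.", or with [regex]::Escape) to match a literal period.
$dotToUnderscore = $name -replace [regex]::Escape("."), "_"

$dashToUnderscore   # sample.v2_final.txt
$dashRemoved        # sample.v2final.txt
$wildcard           # a run of underscores, one per character
$dotToUnderscore    # sample_v2-final_txt
```

Note that the last result also changes the period before the extension. To leave the extension alone in a real rename, the script block can work on $_.BaseName and add $_.Extension back, as in {($_.BaseName -replace "\.","_") + $_.Extension}.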
How does it work? You are 'piping' ("|") the contents of your directory ("dir") to the "rename-item" cmdlet (read: command-lette). If you were to enter only "dir" at the prompt, you would see all the files in your folder displayed; we use the last section of the script to run through all of these files.
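To watch the pipeline work before trusting it with real data, here is a sketch that builds a throwaway folder, previews the rename, and then runs it. The folder and file names are hypothetical. The -WhatIf switch is a standard Rename-Item option that reports what would happen without doing it.

```powershell
# Make a uniquely named scratch folder under the temp directory (hypothetical demo files).
$demo = Join-Path ([System.IO.Path]::GetTempPath()) ("rename_demo_" + [guid]::NewGuid().ToString("N"))
New-Item -ItemType Directory -Path $demo | Out-Null
"wasp-01.txt", "wasp-02.txt" | ForEach-Object {
    New-Item -ItemType File -Path (Join-Path $demo $_) | Out-Null
}

# First half of the pipeline alone: dir (an alias for Get-ChildItem) emits one object per file.
$before = (dir $demo).Name

# Compute the new names without renaming, to see what the script block produces.
$planned = dir $demo | ForEach-Object { $_.Name -replace "-", "_" }

# Dry run: -WhatIf makes Rename-Item report each rename instead of performing it.
dir $demo | Rename-Item -NewName { $_.Name -replace "-", "_" } -WhatIf

# Real run: each file object coming down the pipe is available as $_ in the script block.
dir $demo | Rename-Item -NewName { $_.Name -replace "-", "_" }
$after = (dir $demo).Name

$before; $planned; $after
```

Passing the folder to dir explicitly means the sketch does not depend on your current working directory. Once you have looked at the result, the scratch folder can be deleted with Remove-Item $demo -Recurse.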
I initially learned of this command at: https://www.howtogeek.com/111859/how-to-batch-rename-files-in-windows-4-ways-to-rename-multiple-files/

rjmcmc  [ reversible jump in MrBayes ]

12/17/2016

nst=mixed. This is one of my favorite commands in MrBayes. Rather than designating a nucleotide substitution model a priori, it invokes reversible jump, a form of model averaging (across different dimensions of parameter space) in which all possible time-reversible substitution models are explored, incorporating uncertainty in model selection.
Background:
Of the six possible numbers of nucleotide substitution rates (1 to 6) in the GTR family, MrBayes typically makes use of only half, designated using nst = 1 (F81/JC), nst = 2 (HKY/K2P), and nst = 6 (GTR/SYM).
So, nst = 3, 4, and 5 had been unavailable... until rjMCMC.

Reversible jump Markov chain Monte Carlo is a method to accommodate uncertainty in model selection. Using rjMCMC, MrBayes will incorporate all substitution models (6 rates, 203 models) and the Markov chain will sample each nucleotide substitution model in proportion to its posterior probability (spending the most time in the best-supported models).
[Picture: figure, Huelsenbeck et al. 2004]
[Picture: prior density for the rates when using rjMCMC; the number of possible rates (k) = 6]
[Picture: posterior density of rates (nst=1:6) for gene region 28S D2-D3; the highest density is in rates 3, 4, & 5]
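The count of 203 models, and the general shape of a prior over the number of rates, can be recovered by counting. A time-reversible model is one way of grouping the six exchangeability rates into k classes that share a value. The number of such groupings is the Stirling number of the second kind, S(6,k). The probabilities below assume equal prior weight on each of the 203 models; that is an assumption here, so check the prior settings of your own run.

```latex
\[
\sum_{k=1}^{6} S(6,k) \;=\; 1 + 31 + 90 + 65 + 15 + 1 \;=\; 203
\]
\[
\Pr(k) \;=\; \frac{S(6,k)}{203}:
\qquad
\begin{array}{c|cccccc}
k      & 1     & 2     & 3     & 4     & 5     & 6     \\ \hline
S(6,k) & 1     & 31    & 90    & 65    & 15    & 1     \\
\Pr(k) & 0.005 & 0.153 & 0.443 & 0.320 & 0.074 & 0.005
\end{array}
\]
```

So nst=1 and nst=6 each correspond to a single grouping, while the familiar nst=2 (one transition rate, one transversion rate) is just one of the 31 possible two-rate models. Under this assumption most of the prior weight sits on k = 3 and k = 4.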
After an rjMCMC analysis, you can load the MrBayes parameter files (.p) into Tracer to visualize which substitution rates were sampled most often. [Stats are also output in the .pstat file.] This example shows the rate categories estimated for one subset in a partitioned analysis of a clade of eucharitid parasitoid wasps. All sampled rates contribute to the model.
Results of likelihood-based analyses depend on how well the model fits the data. The credible set of substitution models typically contains several different models, and the benefit of rjMCMC is that it allows integration over all of them. Though tree topology isn't extremely sensitive to model misspecification, other parameters may be (Alfaro & Huelsenbeck 2006).

example of a MrBayes block using rjMCMC:
[after the nexus alignment of data, use these commands for rjMCMC in MrBayes; this shows data grouped in two subsets]
BEGIN MRBAYES;
    log start filename=murray-rj_log;
    charset codon_pos1 = 1-1041\3;
    charset codon_pos2 = 2-1041\3;
    charset codon_pos3 = 3-1041\3;
    partition two_subsets = 2: codon_pos1 codon_pos2, codon_pos3;
    set partition = two_subsets;
    lset applyto=(all) nst=mixed rates=gamma;   [each of the two subsets will be treated separately]
    [use 'nst=mixed' for rjMCMC and 'rates=gamma' to incorporate rate heterogeneity; I don't use a parameter for invariant sites]
    unlink shape=(all) pinvar=(all) statefreq=(all) revmat=(all);
    prset applyto=(all) ratepr=variable;
    mcmc ngen=1000000 samplefreq=100 filename=murray_rj;
    sumt;
    sump;
    [sumt and sump give a default 25% burnin for the tree and parameter files]
    log stop;
END;
Also a great tool -- rjMCMC in BEAST 2! Just install the RBS plugin while in BEAUti and you are set to go. The drawback -- as of now (Dec. 2016) you cannot run the XML file in CIPRES if you are using reversible jump models from the RBS plugin.
note:
  • Which model you get from nst=1, 2, or 6 in MrBayes depends on whether base frequencies are fixed to equal or left unequal. For instance, nst=2 codes the HKY model, because the default here is unequal base frequencies. Entering nst=2 with equal state frequencies (prset statefreqpr=fixed(equal)) gives the K2P model instead. Here's a nice site on MrBayes substitution model commands: https://gist.github.com/brantfaircloth/895282
  • Adding rate variation across sites (model + I + G) does not make a different substitution model from the same model without I + G.
references:
Huelsenbeck, J.P., Larget, B., & Alfaro, M.E. (2004). Bayesian phylogenetic model selection using reversible jump Markov chain Monte Carlo. Mol Biol Evol, 21, 1123-1133.
Alfaro, M.E., & Huelsenbeck, J.P. (2006). Comparative performance of Bayesian and AIC-based measures of phylogenetic model uncertainty. Syst Biol, 55, 89-96.

    PhyloBlog

    Covering topics of phylogenetics and systematics & other science-related news.

Elizabeth A. Murray, PHYLOGENETICS AND EVOLUTION of Hymenoptera

@PhyloSolving  |  e.murray @ wsu.edu