Code can be found --> HERE
Script is --> HERE
So, I have this data simulation Lisp code lying around that I use to generate larger datasets from smaller ones using a GAC tree. We don't need to worry about the specifics; what I needed was a way to run it fast, without jumping into SLIME and loading everything whenever I want to use it.
There are two other annoyances. First, I'm using this Lisp code to output to CSV, but when you write from Emacs Lisp a newline gets printed into the file every 80 characters, and that messes up how I want to use the CSVs. Second, because Lisp is so precise, it outputs floats as their division with a "/" (a ratio like 1/3 rather than a decimal). I wrote a couple of Python scripts to clean that up quickly, and I need to apply them to each file after I generate it. It's a perfect example of compounding a few steps into an easy-to-use bash script...
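For the curious, here is a rough sketch of what that kind of cleanup can look like in Python. This is not my actual script, just a minimal illustration under two assumptions: that any line exactly 80 characters long was cut by the wrapping (so it gets glued to the next line), and that ratios show up as plain integer/integer fields. The names `unwrap`, `fix_ratios`, `clean` and `WRAP_WIDTH` are made up for this example.

```python
import re
from fractions import Fraction

WRAP_WIDTH = 80  # assumption: a stray newline is inserted after every 80 characters
RATIO = re.compile(r"^-?\d+/\d+$")  # assumption: ratios look like 3/4 or -3/4


def unwrap(text, width=WRAP_WIDTH):
    """Rejoin rows that were split by a newline inserted every `width` characters.

    A line of exactly `width` characters is treated as a wrapped fragment.
    A genuine row that happens to be exactly that long would be misjoined.
    """
    rows, current = [], ""
    for line in text.split("\n"):
        current += line
        if len(line) != width:  # shorter (or longer) line: the row ends here
            rows.append(current)
            current = ""
    if current:
        rows.append(current)
    return [row for row in rows if row]  # drop empty rows


def fix_ratios(row):
    """Replace integer/integer fields in one CSV row with decimal floats."""
    fields = []
    for field in row.split(","):
        token = field.strip()
        fields.append(str(float(Fraction(token))) if RATIO.match(token) else field)
    return ",".join(fields)


def clean(text):
    """Unwrap the rows, convert the ratios, and return the cleaned CSV text."""
    return "\n".join(fix_ratios(row) for row in unwrap(text)) + "\n"
```

Something like `clean(open("albrecht.csv").read())` would then give back the repaired text. Note the naive comma split: it would break on quoted fields containing commas, which is fine only if the data is purely numeric or simple labels.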
Now all I do is run something like this to generate a big albrecht dataset (size 1000) and write it to a CSV file named albrecht.csv:
./sampler albrecht 1000 albrecht.csv
It's not that fancy but it does one thing and it does it well, in the unix way.
For you black belts in the crowd, here's how I generate 2000-example data files from all of my existing data:
for file in $(ls data/ | cut -d"." -f1); do ./sampler "$file" 2000 "examples/2000egs/$file.csv"; done