[OpenSPIM] storage strategies for SPIM datasets?

Feinstein, Timothy N tnf8 at pitt.edu
Wed Jun 17 08:14:07 CDT 2015


Hi Maarten, 

We ran into a similar dilemma with large volume data from a
spectral-detecting confocal.  Ultimately you either need to invest in an
enterprise-scale storage server or you have to redefine what 'raw' data
means.  We concluded that our spectral deconvolution algorithms were
pretty well optimized and unlikely to be revisited soon, so we archived
the three/four/five-channel spectrally deconvolved datasets as the 'raw'
file and did not keep the 300 GB+ 32-channel original.  Friends who work
with X-ray data from free-electron lasers face the same dilemma, since a
single 'raw' dataset can add up to 100+ TB.  The key is to work out how
confident you feel about your processing steps.  If you are likely to
revisit your alignment/cropping protocol then keep the raw data, at least
on platter drives in a closet.  Either way I recommend managing processed
files with OMERO.  I am a big fan of that software and very grateful to
the OME group for making it.
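If you do go the lightweight route before (or alongside) OMERO, even a single-table SQLite database beats an ever-growing folder swamp for recording what was done to each dataset and when the raw data can go. A minimal sketch in Python (the table and column names here are illustrative placeholders, not any OMERO schema):

```python
import sqlite3

# Minimal dataset-tracking sketch; schema names are placeholders.
# In practice, point this at a file on the NAS instead of ":memory:".
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE datasets (
        id INTEGER PRIMARY KEY,
        acquired TEXT,              -- ISO date of the recording
        nas_path TEXT,              -- where the files live on the NAS
        size_gb REAL,
        processing TEXT,            -- e.g. 'cropped, fused, projected'
        raw_deleted INTEGER DEFAULT 0
    )
""")
conn.execute(
    "INSERT INTO datasets (acquired, nas_path, size_gb, processing)"
    " VALUES (?, ?, ?, ?)",
    ("2015-06-10", "/nas1/timelapse_042", 512.0, "cropped, fused"),
)
conn.commit()

# Which recordings still hold raw data we could reclaim?
rows = conn.execute(
    "SELECT nas_path, size_gb FROM datasets WHERE raw_deleted = 0"
).fetchall()
print(rows)  # [('/nas1/timelapse_042', 512.0)]
```

Adding a thumbnail column (a small JPEG blob per recording) is straightforward from there, though at that point OMERO really does start to look like the better investment.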

Best, 


Tim

Timothy Feinstein, Ph.D.
Research Scientist
University of Pittsburgh Department of Developmental Biology





On 6/17/15, 8:47 AM, "Maarten Hilbrant" <m.hilbrant at uni-koeln.de> wrote:

>Dear all,
>
>our SPIM (which in fact is an mDSLM) is up and running - wonderful.
>
>This also means that we're producing a lot of data very quickly
>(typically about 500GB per time lapse recording, about one recording per
>week). In an attempt to streamline our data analysis and storage
>workflow, I wondered what your experiences are.
>
>I currently just dump our data on our two 13.5TB NAS servers (a second
>set of NAS servers is used as a backup). Whenever I want to analyse a
>dataset, I first copy it to the 5TB internal hd of a reasonably powerful
>workstation, as reading/writing is often the limiting step for most
>analysis procedures. After cropping, making Z-projections, 3D
>registration/fusion etc I copy the results back to subfolders of the
>original data on the NAS server, creating an ever-growing swamp of data.
>Until I've finally convinced myself that it's ok to delete the raw data:-)
>
>This "workflow" has worked reasonably well for me, but it is rather
>difficult to keep an overview of all the data and I just know I'm
>occupying much more storage space than strictly necessary to answer our
>biological questions. So I'm investigating the possibilities for setting
>up a relational database to at least keep track of all the metadata,
>analyses performed etc. Ideally, such a database would include thumbnails
>of the imaging data and allow for easy import of metadata. Any ideas? Of
>course not very "open", but I'm tempted to use Filemaker Pro. Anything
>else seems like reinventing the OMERO wheel. Does anyone have
>experience with using OMERO for storing all data from large SPIM
>datasets? Or would you just store projections and keep the raw data
>somewhere else? Any ideas appreciated!
>
>cheers,
>Maarten
>
>
>
>_______________________________________________
>OpenSPIM mailing list
>OpenSPIM at openspim.org
>http://openspim.org/mailman/listinfo/openspim




