Creating a CARMEN Data Repository

At CARMEN, we encourage users to make their data public so that the whole neuroscience community can benefit. If the data is substantial and collaborative, the best place for it is a structured data repository. This page gives a few tips on how to create a public data repository on CARMEN.

Firstly, think about how you would like the repository to be structured. The data will be displayed to users as a file tree under your name in the public folder (please contact us if you want the repository to appear at the top level of the public folder; the “Retinal Waves Repository (Gigascience)” in the public folder of CARMEN is an example of this). Typically, the repository would have a base folder with a name describing the repository. Underneath the base folder, there might be subfolders containing data from specific laboratories, years, or groups of experiments, each containing a folder for individual experiments. Note that although the data will be visible in the public folder, you actually create the structure in your own data space. The process of publicly sharing the data creates a link in the public folder that points to your data.
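It can help to lay the structure out on your own machine before creating it in your CARMEN data space. The following is a minimal Python sketch of the base folder / group / experiment layout described above. It only creates local folders and does not talk to CARMEN; the function name and all folder names are hypothetical placeholders.

```python
from pathlib import Path
import tempfile


def build_repository_skeleton(root, base_name, layout):
    """Create <base_name>/<group>/<experiment> folders under root.

    layout maps a group name (e.g. a laboratory or year) to a list of
    experiment folder names. Returns the experiment folders, sorted.
    """
    base = Path(root) / base_name
    created = []
    for group, experiments in layout.items():
        for experiment in experiments:
            folder = base / group / experiment
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return sorted(created)


# Hypothetical layout: two laboratories, each with its own experiments.
layout = {
    "lab_a": ["experiment_01", "experiment_02"],
    "lab_b": ["experiment_01"],
}

with tempfile.TemporaryDirectory() as tmp:
    for folder in build_repository_skeleton(tmp, "example_repository", layout):
        # Print each experiment folder relative to the temporary root.
        print(folder.relative_to(tmp))
```

Seeing the tree written out like this makes it easier to judge whether the grouping (by laboratory, year, or experiment type) will make sense to someone browsing the public folder.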

The next objective is to decide on the metadata and its granularity. When data files are uploaded to CARMEN, the upload process asks the user to complete a metadata form, and the resulting metadata document is then attached to each of the files in the upload. Ideally, as much of the metadata document as possible should be completed, as this aids searchability and gives more meaning to the data. If the whole repository is uploaded in one go, all of the files in it will have the same metadata, and it will be difficult to distinguish the data files from each other in the CARMEN search facility. Alternatively, the files could be uploaded one at a time, so that each file has its own unique metadata document. However, this approach could require a lot of effort, even though CARMEN supports metadata templates. A compromise is typically used, whereby groups of files with something in common are uploaded together and a common metadata form is attached to them.
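One way to plan the compromise is to list your files locally with the attributes you intend to record, then group the files that share the same values; each group becomes one upload with one metadata form. The sketch below illustrates this in Python. It is a planning aid only and does not use any CARMEN interface; the file names, attribute names, and function name are all hypothetical.

```python
from collections import defaultdict


def group_into_upload_batches(files, shared_keys):
    """Group file names by the values of the attributes in shared_keys.

    files maps a file name to a dict of descriptive attributes.
    Returns a dict mapping each tuple of shared values to a sorted
    list of the file names that have those values.
    """
    batches = defaultdict(list)
    for name, attributes in files.items():
        key = tuple(attributes.get(k) for k in shared_keys)
        batches[key].append(name)
    return {key: sorted(names) for key, names in batches.items()}


# Hypothetical files and attributes, for illustration only.
files = {
    "recording_01.dat": {"lab": "lab_a", "year": 2012},
    "recording_02.dat": {"lab": "lab_a", "year": 2012},
    "recording_03.dat": {"lab": "lab_b", "year": 2013},
}

for key, names in group_into_upload_batches(files, ["lab", "year"]).items():
    # Each batch would be uploaded together with one metadata form.
    print(key, names)
```

Choosing more attributes in `shared_keys` gives smaller batches with more specific metadata; choosing fewer gives larger batches and less form-filling.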

The format of the data that you want in your repository is vitally important. Preferably, the data should be in a standard data format and contain all the information required by others to view and process it. In CARMEN we have our own standard data format, the Neuroinformatics Data Format (NDF), which we encourage you to use. If you wish to use your own format, that is fine, but we encourage you to provide a well-specified document detailing the format so that others can make use of the data.

The next stage is making the data public. The portal's current default behaviour is that making any data public places a link to the data’s parent folder in the user’s public space, so any repository folder structure is lost. There is, however, an option that keeps the folder structure you specified in your private space. To use it, on the sharing panel, select “Apply single sharing settings to data and metadata”, then select “public”; a tick box should then appear labelled “Create full path to shared data files”, which should be ticked. Note also the tick box labelled “Apply single sharing settings to all data associated with metadata”: ticking this box sets the sharing rights for all data associated with the same metadata document at the same time.

Figure: Sharing panel options to keep data structure for public data

To further promote your data repository, we can create a page on this site giving information about the repository, such as the data formatting. We can add URL links that point to the data and associated resources on CARMEN. The page can also contain links to publications, documents, research groups, etc.

Please contact us if you have any problems or requests regarding building repositories, as we are always happy to help.

We are currently reviewing the presentation of public data and repositories on CARMEN to make the data easier to create, manage, and view.