THE UMASS RAW PIXEL DATA ARCHIVE
The 10.4 TB of unprocessed pixel data taken at the 2MASS observing facilities from April 18, 1997 to February 15, 2001 are archived at UMASS on 335 DLT 7000 tapes and on 35 300 GB SATA hard disks. The purpose of this document is to explain how to locate and access data in the archive. To this end, SQL server tables for locating data and software for converting the data on tape and disk to a more user-friendly format are provided. The source code for the conversion software also documents the format of the binary data files.

The basic unit of 2MASS data is called a scan. There are 205,071 scans and 1,230,414 files in the raw pixel data archive. Not all of the data in the archive were taken when the sky conditions were photometric. The 2MASS scan data table (twomass_scn) distributed in the 2MASS All Sky Catalog DVD release contains entries for 59,731 scans which are known to be photometric and which provide 4-pi coverage of the sky. The scan data table schema and data can be downloaded from this page.

A specific scan is uniquely identified by three parameters that are listed for each source in the 2MASS catalogs: date, hemis, and scan. These parameters are defined in the catalog specification documents such as that prepared for the point source catalog. A user wishing to obtain raw data for a specific sky location rather than a specific 2MASS source should refer to the 2MASS scan data table (twomass_scn), which lists the boundaries of individual scans. Nearby 2MASS catalog sources and the 2MASS image server can also be used to identify scans useful for specific sky locations. Frequently a selected point on the sky will lie within more than one scan.

The raw data archive contains all data scans taken at the 2MASS observing sites. Some of these scans were not taken during photometric sky conditions, and scans may be bad for other reasons. Scans that appear in the 2MASS catalogs have passed the quality assurance review described in the explanatory supplement. The southern hemisphere camera was never warmed up to room temperature following initial commissioning. For this reason the properties of the southern NICMOS arrays were relatively stable throughout the 2MASS observations. The northern hemisphere camera was warmed to room temperature each summer during the period when the observatory was closed because of the Arizona monsoon season. It also had to be temperature cycled in the early years of the survey to release gas that had accumulated because of a leak in the insulating vacuum. The temperature cycling damaged the H-band array in this camera and it had to be changed during the survey. Twilight and dark frames used for analysis should be chosen from those taken close to the date of the data being examined. Section 4_2 of the explanatory supplement lists the periods of stable hardware configuration. The operations managers' logs can also be checked for hardware and software changes.


LOCATING THE DLT TAPE OR DISK WHERE THE DATA FOR A SELECTED SCAN ARE STORED
Labeled DLT tapes are stored in boxes through which the label is visible. Disks are stored in labeled boxes. The raw_scans table can be searched on date-hemis-scan to determine the labels of the DLT tape and the disk containing the data. For example, the entry for scan 1 taken at Cerro Tololo with date 1998-10-10 shows that the data are stored on the DLT tape labeled raws0032 and on the disk labeled s15.


    date    | day_no | hemis | scan | tile  | s_dir | s_type | frames | as_cat | tape_label | disk_label
------------+--------+-------+------+-------+-------+--------+--------+--------+------------+------------
 1998-10-10 |    589 | s     |    1 | 90234 | n     | cal    |     48 | n      | raws0032   | s15
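
The lookup described above can be sketched in Python. This is only an illustration: it uses an in-memory SQLite table as a stand-in for the real raw_scans database, loads just the one row shown in the listing above, and the helper name locate_scan is hypothetical. The column names are those of the listing.

```python
import sqlite3

# In-memory stand-in for the raw_scans table (assumption: the real table
# lives in a separate SQL database; only the row listed above is loaded).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE raw_scans (
           date TEXT, day_no INTEGER, hemis TEXT, scan INTEGER, tile INTEGER,
           s_dir TEXT, s_type TEXT, frames INTEGER, as_cat TEXT,
           tape_label TEXT, disk_label TEXT)"""
)
conn.execute(
    "INSERT INTO raw_scans VALUES (?,?,?,?,?,?,?,?,?,?,?)",
    ("1998-10-10", 589, "s", 1, 90234, "n", "cal", 48, "n", "raws0032", "s15"),
)


def locate_scan(conn, date, hemis, scan):
    """Return (tape_label, disk_label) for one scan, or None if it is not listed."""
    cur = conn.execute(
        "SELECT tape_label, disk_label FROM raw_scans "
        "WHERE date = ? AND hemis = ? AND scan = ?",
        (date, hemis, scan),
    )
    return cur.fetchone()


print(locate_scan(conn, "1998-10-10", "s", 1))  # ('raws0032', 's15')
```

All three keys are needed in the WHERE clause because scan numbers repeat every night and at both sites.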


The format description for the raw_scans table is given in the file format-rst.html. The raw_scans table can be downloaded as ASCII files.

As the 2MASS data acquisition software was not stable prior to 1997-06-01, there may be non-standard entries in the columns of the raw_scans table prior to this date. A DLT tape contains data from only one observing site, and the site location is encoded in the tape label (RAWNXXXX or RAWSXXXX). A hard disk contains data from only one observing site, and the site location is encoded in the disk label (nXX or sXX). Furthermore, the data from a specific date and location (hemis) are never stored on more than one tape or disk. A summary lgo table and a summary file inventory table in postgres dump format can be downloaded. There are duplicate entries for scans numbered 2 and 45 taken on 1997-04-23 at Mt. Hopkins in these tables.
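
Because the site is encoded in every tape and disk label, it can be recovered from the label alone. The following is a minimal sketch; the function name site_from_label is hypothetical, and the accepted patterns are inferred from the labels quoted in this document (for example raws0032, rawn0057, s15, n12).

```python
import re


def site_from_label(label):
    """Return 'n' or 's' for a tape label (rawn.../raws...) or a disk label (n../s..)."""
    # Optional "raw" prefix, then the hemisphere letter, then the serial digits.
    match = re.fullmatch(r"(?:raw)?([ns])\d+", label.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized archive label: {label!r}")
    return match.group(1)


print(site_from_label("RAWS0136"))  # s
print(site_from_label("n12"))       # n
```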

READING THE DLT TAPES

The 10.4TB 2MASS raw NICMOS pixel data are archived on 335 DLT 7000 tapes written without software compression. Create a directory with the same name as the tape label. From this directory execute the tape read bash script. This script will read all data (typically about 30 GB) from the tape into the current directory. The script will terminate with an error after the last file is read.

For each night of data (i.e. distinct date) read from the tape, the directory will contain one .dir and one .lgo file. The naming convention for these files is {date}{hemis}.{file type}. These ascii files can be read with a text editor. The .dir (directory) files can be used to verify that all of the data supposed to be on the tape were in fact read from the tape. The .lgo (log) files provide information similar to that given in the raw_scans table. The first 10 lines of the file 000529s.lgo (located on tape RAWS0136) are shown below:

\CHAR Logfile  = 000529s.lgo
\CHAR ORDate   = 000529 \CHAR Hemisp   = s
\CHAR DayNum   = 1186
\--------------------------------------------------------------------------------------------
|Scn| sr_UT     |StrpID|R|sr_RA_2000 |sr_Dec_2000| Air_M|Frms|D|Typ| Comment                 |
| c | char      | int  |c| char      | char      | real | int|c| c | char                    |
\--------------------------------------------------------------------------------------------
 001 22:37:16.00  92397 - 11:21:49.10 -13:51:41.0 1.070   048 n cal
 002 22:38:46.00  92397 - 11:21:49.43 -12:34:52.9 1.080   048 s cal
 003 22:40:15.00  92397 - 11:21:49.76 -13:51:41.0 1.070   048 n cal


The first column in the table is the scan number. Frms is the number of NICMOS frame pairs (r1 and r2) in the scan. The column after Frms (D) is the direction of the scan, n for north going and s for south going. Note that this is not the hemis(phere) or observatory location. The Typ column lists the type of scan. The .lgo files were parsed to create the raw_scans table.
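
A minimal sketch of how the data rows of a .lgo file might be parsed is given below. It assumes that the rows are whitespace-delimited, that no field other than the comment is ever empty, and that header lines begin with a backslash or a vertical bar, as in the listing above. The field names are illustrative, and the readings of sr_UT as a start time and Air_M as air mass are assumptions.

```python
from typing import NamedTuple


class LgoRow(NamedTuple):
    scan: int        # Scn
    ut: str          # sr_UT
    strip_id: int    # StrpID
    r_flag: str      # R (meaning not described in this document)
    ra_2000: str     # sr_RA_2000
    dec_2000: str    # sr_Dec_2000
    air_m: float     # Air_M
    frames: int      # Frms, number of r1/r2 frame pairs
    direction: str   # D, 'n' north going or 's' south going
    scan_type: str   # Typ
    comment: str     # Comment, may be empty


def parse_lgo_row(line):
    """Split one data row of a .lgo file into typed fields."""
    parts = line.split(None, 10)  # the 11th piece, if any, is the free-text comment
    if len(parts) < 10:
        raise ValueError(f"not a .lgo data row: {line!r}")
    comment = parts[10].strip() if len(parts) > 10 else ""
    return LgoRow(int(parts[0]), parts[1], int(parts[2]), parts[3], parts[4],
                  parts[5], float(parts[6]), int(parts[7]), parts[8], parts[9],
                  comment)


def read_lgo_rows(lines):
    """Yield parsed rows, skipping blank lines and header lines."""
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith(("\\", "|")):
            yield parse_lgo_row(line)


row = parse_lgo_row(" 001 22:37:16.00  92397 - 11:21:49.10 -13:51:41.0 1.070   048 n cal")
print(row.scan, row.frames, row.direction, row.scan_type)  # 1 48 n cal
```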

For each 2MASS scan three data files (one for each wavelength band) and three parameter files (one for each wavelength band) will be present in the current directory.  

Binary data files are named according to the convention {band}{day_no}{scan}.rds.gz. Thus the file h1186063.rds.gz is scan 63 taken on day number 1186 (date 000529) in H band. The file format is .rds and it is gzipped. Each data file is accompanied by a parameter file (for example, h1186063.par.gz) with the same name except that .rds has been replaced by .par. The ascii parameter file was generated by the observatory data acquisition computer system at the time of observation. It can be viewed with a text editor. Many of the parameters in this file appear in the fits headers of the coadded image archive files.
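
The naming convention can be unpacked programmatically. The sketch below assumes a four-digit day number and a three-digit scan number, as in every file name quoted in this document. The day-zero date of 1997-02-28 is not stated anywhere in the archive documentation; it is inferred from the date and day_no pairs in the listings here (for example 2000-05-29 and 1186) and should be treated as an assumption. The names parse_raw_name and day_no_to_date are hypothetical.

```python
import re
from datetime import date, timedelta
from typing import NamedTuple

# Inferred from the date/day_no pairs listed in this document; an assumption,
# not a documented epoch.
DAY_ZERO = date(1997, 2, 28)

_NAME_RE = re.compile(r"([jhk])(\d{4})(\d{3})\.(rds|rdo|par)(\.gz)?")


class RawFileName(NamedTuple):
    band: str       # 'j', 'h' or 'k'
    day_no: int
    scan: int
    kind: str       # 'rds', 'rdo' or 'par'
    gzipped: bool


def parse_raw_name(name):
    """Decode a file name of the form {band}{day_no}{scan}.{kind}[.gz]."""
    match = _NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized raw file name: {name!r}")
    band, day_no, scan, kind, gz = match.groups()
    return RawFileName(band, int(day_no), int(scan), kind, gz is not None)


def day_no_to_date(day_no):
    """Convert a survey day number to a calendar date under the assumed epoch."""
    return DAY_ZERO + timedelta(days=day_no)


info = parse_raw_name("h1186063.rds.gz")
print(info.band, info.day_no, info.scan, day_no_to_date(info.day_no))
# h 1186 63 2000-05-29
```

Since the hemisphere is not part of the file name, it has to be carried separately, for example from the enclosing directory or the tape label.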

As the observatory location (hemisphere) is not encoded in the .par and .rds file names, data from different hemispheres must be kept in different directories.

READING THE SATA HARD DISKS

The SATA hard disks are formatted with the Reiser file system current as of Linux kernel 2.6.13. The top level directory of the hard disks contains sub-directories named according to the convention {date}{hemisphere}. Each of these sub-directories contains all of the files related to a specific date and observatory location (hemisphere). The files in these directories are as described in the previous section on reading DLT tapes.
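
A sub-directory name can be split into its date and hemisphere parts as sketched below. This assumes that the {date} part uses the same six-digit yymmdd form seen in the .lgo file name 000529s.lgo, which is not confirmed here for the disk directories; the function name parse_night_name is hypothetical.

```python
from datetime import datetime


def parse_night_name(name):
    """Split a '{date}{hemisphere}' name such as '000529s' into (date, hemisphere)."""
    if len(name) != 7 or name[6] not in ("n", "s"):
        raise ValueError(f"unrecognized night name: {name!r}")
    # %y maps 69-99 to 1969-1999 and 00-68 to 2000-2068, which covers 1997-2001.
    night = datetime.strptime(name[:6], "%y%m%d").date()
    return night, name[6]


print(parse_night_name("000529s"))  # (datetime.date(2000, 5, 29), 's')
```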

The top level directory also contains disk file and lgo inventories in postgres format. The disk label is encoded in a file named disk_XXX.

As the observatory location (hemisphere) is not encoded in the .par and .rds file names, data from different hemispheres must be kept in different directories.



FILE ERRORS

Four files in the tape archive are truncated. Gunzip will return an error when processing these files.

    date    | hemis | scan |      file       | s_type | frames | as_cat | tape_label | disk_label
------------+-------+------+-----------------+--------+--------+--------+------------+------------
 1998-09-21 | n     |   83 | k0570083.rds.gz | dar    |     48 | n      | rawn0057   | n12
 2000-04-06 | n     |  137 | k1133137.rds.gz | cal    |     48 | n      | rawn0141   | n03
 2000-05-19 | n     |  110 | k1176110.rds.gz | cal    |     48 | n      | rawn0147   | n02
 1999-05-08 | s     |  112 | k0799112.rds.gz | dar    |     48 | n      | raws0070   | s11


These file errors are not present in the SATA hard disk archive, as replacement files were obtained from IPAC for the hard disk archive. All files in the SATA hard disk archive have been checked using gunzip -t and no errors have been reported.
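
A check similar in spirit to gunzip -t can be sketched with the Python standard library. It is an illustration rather than the procedure actually used on the archive; the function name gzip_is_intact is hypothetical.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole gzip stream in `path` decompresses without error."""
    # A missing or unreadable file raises here instead of being reported as corrupt.
    with open(path, "rb") as raw:
        try:
            with gzip.GzipFile(fileobj=raw) as stream:
                while stream.read(chunk_size):
                    pass  # discard the data; only the integrity checks matter
        except (EOFError, OSError, zlib.error):
            # EOFError: stream ended early (truncated file)
            # OSError:  bad header or CRC/length mismatch
            return False
    return True
```

Applied to the four tape files listed above, a check of this kind would be expected to return False, since a truncated stream ends before the end-of-stream marker.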


RAW DATA FILE FORMAT (.rds and .rdo files)
Raw 2MASS scans are stored as gzipped binary .rds files. In the .rds files the data have been reordered in a manner that makes gzip compression more effective. The rds2rdo program can be used to convert .rds files to .rdo files, which have a more logical data format. The source code rds2rdo.c can be downloaded. For example, to convert the J-band data file j1186063.rds.gz to j1186063.rdo use the command:

zcat j1186063.rds.gz | ./rds2rdo > j1186063.rdo  

The converted file j1186063.rdo.gz may be downloaded to check for correct operation of the rds2rdo program. Source code for the reverse conversion rdo2rds.c has also been provided. There is no loss of information in the rds-rdo conversion and

cat oldfile | rds2rdo | rdo2rds  > newfile


is equivalent to a cp oldfile newfile command.

A scan consists of a series of frames (273 in the case of j1186063.rdo) separated in declination by approximately 1/6th of the image size. More details about the scanning procedure are given in the section on mapping. Two sets of data are recorded for each frame. The first 256 x 256 data block of unsigned 16 bit words (called read 1 or r1) contains data read from the NICMOS array 52 ms after the array was reset. The second 256 x 256 data block (called read 2 or r2) contains data that were read from the NICMOS array 1298 + 52 ms after the array was reset.
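
The numbers in the preceding paragraph fix the amount of pixel data per frame and the effective integration time of an r2-r1 difference. The short sketch below works through the arithmetic. It counts pixel words only; any header or padding bytes in the actual .rdo layout are not accounted for, and raw2fits.c remains the authoritative description of the format.

```python
PIXELS_PER_SIDE = 256   # NICMOS array is 256 x 256
BYTES_PER_WORD = 2      # unsigned 16 bit words
READS_PER_FRAME = 2     # r1 and r2

R1_DELAY_MS = 52            # r1 is read 52 ms after reset
R2_DELAY_MS = 1298 + 52     # r2 is read 1298 + 52 ms after reset


def frame_payload_bytes():
    """Bytes of pixel data in one frame (r1 block plus r2 block)."""
    return READS_PER_FRAME * PIXELS_PER_SIDE * PIXELS_PER_SIDE * BYTES_PER_WORD


def scan_payload_bytes(n_frames):
    """Bytes of pixel data in a scan of n_frames frames."""
    return n_frames * frame_payload_bytes()


def difference_integration_ms():
    """Time between the two reads, i.e. the integration time of r2-r1."""
    return R2_DELAY_MS - R1_DELAY_MS


print(frame_payload_bytes())         # 262144
print(scan_payload_bytes(273))       # 71565312
print(difference_integration_ms())   # 1298
```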

A raw-data-to-fits file conversion program raw2fits may be used to create r1, r2, and r2-r1 fits images from the raw data. The source code for this program, raw2fits.c, illustrates the binary .rdo file format. Note that it is possible for r2-r1 data to be negative for reasons that will be explained in a following section. The command

./raw2fits j1186063.rdo 220


causes the creation of three fits files from frame 220 of the J-band scan: j1186063_r1_220.fits, j1186063_r2_220.fits, and j1186063_r2-r1_220.fits.

Processed J-band coadded image. Unprocessed image file j1186063_r2-r1_220.fits

The above images show that the orientation of the fits image produced by raw2fits is identical to that of the processed coadded image, that is with north on the top and east to the right. A faint feature called "reset decay", which will be described in a later section, is visible as the horizontally shaded band extending across the bottom and the center of the unprocessed image. The NICMOS arrays are electrically divided into four quadrants (barely visible in the unprocessed image above). All four quadrants are read out of the array simultaneously. The first pixels read are in the lower left corner of each quadrant. Readout proceeds from left to right (to the east). After the first row has been read, the next row above is read from left to right, and so on. The last pixels read are in the upper right corner of each quadrant. Note that the quadrant structure of the array is not mirrored in the format of the .rdo binary data files. Very bright sources are saturated in both r1 and r2 images. As a consequence, r2-r1 images of very bright sources have a dark hole in the center: the saturation value of r1 and r2 is the same, so if both are saturated the difference is zero. Bright sources that saturate in r2 but not in r1 have flat-topped profiles.
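
The behaviour of the r2-r1 difference can be illustrated with a few pixel values. This is a sketch, not code from raw2fits: the function name difference_image is hypothetical, and the saturation value used is a placeholder rather than the measured saturation level of the arrays. The point is that the difference must be held in a signed type; storing it back into an unsigned 16 bit word would wrap negative values around.

```python
from array import array


def difference_image(r1, r2):
    """Return r2 - r1 pixel by pixel as signed Python integers."""
    if len(r1) != len(r2):
        raise ValueError("r1 and r2 must have the same number of pixels")
    return [b - a for a, b in zip(r1, r2)]


SATURATED = 65535  # placeholder value, not the measured saturation level

# Three pixels: a normal one, one where r2 < r1, and one saturated in both reads.
r1 = array("H", [1000, 1200, SATURATED])
r2 = array("H", [1500, 1100, SATURATED])

print(difference_image(r1, r2))  # [500, -100, 0]
```

The last value shows the origin of the dark hole in very bright sources: identical saturated values in r1 and r2 difference to zero.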


PROPERTIES OF THE RAW IMAGES

Processed coadded image. Unprocessed r2-r1 raw frame.

The above images illustrate the plate scale difference between a raw and a processed coadded image. The raw data from the NICMOS arrays have a plate scale of ~2 arc-seconds/pixel. Prior to coadding, the raw data were repixelated into 1 arc-second pixels by an elegant algorithm suggested by Martin Weinberg. Dithering (i.e. the NICMOS array columns were not exactly aligned with declination) and declination scan rate were adjusted to achieve optimum sampling by the six or seven raw data frames that were coadded to form a processed image. Accurate values for the plate scales, dither angles, and scanning rates can be obtained from the fits headers of the files in the processed image archive.


DARK SCANS

Dark scans are used to characterize the electrical properties of the NICMOS arrays and digitizing electronics. These include the so-called jail bar artifacts visible in r1 and r2 frames, the reset decay artifact most visible in r2-r1 frames, and the stripe artifact. Hot pixels are easily identified in dark images. The no-illumination pixel data value was set to place the range of the analog signals output from the camera within the range of the analog to digital conversion electronics. This setting was adjusted from time to time during the course of the survey. As there was only one offset adjustment for each array, the offsets of the quadrants of the array are slightly different.

Dark scans were taken with a cold shutter in front of the NICMOS arrays. The j-band r1 image at the left shows the cold shutter in a partially closed position. From time to time dark frames were taken with the cold shutter not completely covering the arrays. For this reason dark frames should be viewed before being used for analysis.
The jail bar artifact is clearly visible in the image to the left. The full sized fits image rather than this 50 x 50 pixel lower left corner can be downloaded for examination by clicking on the image. The jail bar artifact appears in identical fashion in r1 and r2 images and is not visible in r2-r1 images.

The reading of a collection of hot pixels near column 6 in row 205 (counting from the left and bottom of the image) at the left side of the upper left quadrant has created an electrical disturbance that appears as a faint horizontal stripe in all four quadrants. The stripe appears in the same place in all four quadrants as they are being read out at the same time.

The reset decay artifact is visible as the dark band in the bottom rows of each quadrant in the image above and to the left. The reset decay transient is present in the r1 but not in the r2 images. It is visible in r2-r1 images. It decays away in an irregular manner as the pixels are read. The pattern of the decay is identical from frame to frame. The graph to the left shows the effect in r2-r1 along the second row from the bottom of the image. The size of this artifact is different for each NICMOS array. The vertical line visible at the center of the array between the quadrants is caused by a transient that occurs when the first column of an array quadrant is read. It is most prominent in r1 and r2 images. This artifact is also present at the far left side of each quadrant. The amplitude of the effect changed when camera preamplifiers were modified during the course of the survey.

Sample .fits files of a k-band dark frame may be downloaded by clicking on the images below.

A Read 1 K-Band Dark Frame A Read 2 K-Band Dark Frame A Read 2 - Read 1 K-Band Dark Frame


TWILIGHT SCANS
Twilight scans are used to calibrate the relative sensitivity of the pixels in the NICMOS arrays. A sequence of scans was taken during evening or morning twilight with the telescope scanning in declination and open to the sky. Stars may be visible in these scans. The twilight scan times were chosen to span a range of sky illumination. The use of these scans in the IPAC processing of the 2MASS catalog is described in the explanatory supplement.

K-band r2-r1 twilight frame.  Plot of NICMOS data along blue line in figure to left.

Sample .fits files of a k-band twilight frame may be downloaded by clicking on the images below.


A Read 1 K-Band Twilight Frame A Read 2 K-Band Twilight Frame A Read 2-Read 1 K-Band Twilight  Frame


A CAUTION REGARDING THE USE OF TWILIGHT SCANS FOR ARRAY FLAT FIELDING

Subsequent to the processing and release of 2MASS data products it was discovered that pixels in the first column read in each quadrant of the k-band and h-band NICMOS arrays in the CTIO camera were miscalibrated for data taken after 990117. The miscalibration has its origin in an electronic nonlinearity that is present in the brighter twilight frames that were used for flat-fielding. The images below show for each pixel in the array the ratio (sky1-dark)/(sky2-dark) for two frames, sky1 and sky2, taken during changing twilight illumination. If the response of the camera electronics is linear, this ratio should be the same for all pixels in the array. As can be seen below, the ratio is quite uniform except in the area of background stars (black images), dead and hot pixels, and the central vertical column. The central column's deviation from linearity appears to be limited to twilight conditions in which all pixels of the array are illuminated. The deviation caused the IPAC processing, which assumed linearity, to underestimate the flux detected by pixels in the affected column by about 10% in K-band and somewhat less in H-band. Prior to the start of observations on 990118 a modification was made to the camera preamplifiers to increase their gain by a factor of two. This change was made to prepare the camera for use with Leach digitizing electronics. Although a similar modification was made to the Mt. Hopkins camera preamplifiers, the effect described in this section does not appear in any of the Mt. Hopkins data.
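
The linearity test described above can be sketched as follows. The ratio is the one given in the text; the function names, the use of the median as the reference value, and the 5% tolerance are illustrative assumptions and are not taken from the IPAC processing.

```python
from statistics import median


def linearity_ratio(sky1, sky2, dark):
    """Per-pixel (sky1 - dark) / (sky2 - dark); None where the denominator is zero."""
    if not (len(sky1) == len(sky2) == len(dark)):
        raise ValueError("sky1, sky2 and dark must have the same number of pixels")
    ratios = []
    for s1, s2, d in zip(sky1, sky2, dark):
        denominator = s2 - d
        ratios.append((s1 - d) / denominator if denominator != 0 else None)
    return ratios


def flag_nonlinear(ratios, tolerance=0.05):
    """Indices of pixels whose ratio differs from the median ratio by more than tolerance."""
    valid = [r for r in ratios if r is not None]
    if not valid:
        raise ValueError("no valid ratios")
    reference = median(valid)
    if reference == 0:
        raise ValueError("median ratio is zero")
    return [i for i, r in enumerate(ratios)
            if r is not None and abs(r / reference - 1.0) > tolerance]


# Four pixels; the last one responds about 10% low in the brighter frame.
sky1 = [1100, 1100, 1100, 1000]
sky2 = [600, 600, 600, 600]
dark = [100, 100, 100, 100]
print(flag_nonlinear(linearity_ratio(sky1, sky2, dark)))  # [3]
```

Pixels covered by stars, and dead or hot pixels, would also be flagged by a test of this kind and need to be excluded separately.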

K-band linearity test. 990118s H-band linearity test. 990118s

Darks.pdf is the IPAC software subsystem specification document that defines IPAC 2MASS dark and twilight frame processing.


SURVEY SCANS
Sample .fits files of a k-band survey frame may be downloaded by clicking on the images below.

Read 1 K-Band Survey Frame Read 2 K-Band Survey Frame Read  2-Read 1 K-Band  Survey Frame


ASTIGMATISM (Elongated Images)

Astigmatism was present in the 2MASS cameras in those bands (h and k) where the image had to pass through the thick substrate of dichroic beam splitters tilted at 45 degrees to the image path. As the k band image passed through two dichroic substrates and the h band image through only one, the camera astigmatism was greater in k band than in h band. The Mt. Hopkins telescope optics were astigmatic from the start of survey operations until September 2000, when the primary mirror radial constraints were changed to be identical to those of the Cerro Tololo telescope, which never exhibited astigmatism. The consequence of astigmatism is that out-of-focus images are elongated and that the direction of elongation rotates as the image is moved in and out of focus.

Astigmatic image
A portion of the K band image from scan 98 coadd 56 taken at Mt. Hopkins on June 11, 1997. The file name is ki0980056. The images are elongated in the N-S direction because of astigmatism in the telescope and the camera optics. This scan is one of the most astigmatic scans to be used in the generation of the 2MASS catalogs.

Data taken before July 13, 1997 at Mt. Hopkins are most affected by astigmatism. On July 13, 1997 a 6 micron / degree C temperature compensation factor was introduced into the telescope control system to improve the accuracy of the focus setting.
As the 2MASS data were fitted with one-dimensional point spread functions, attempts made to improve the deblending of close binaries in the 2MASS data processing pipeline did not succeed and were not implemented.


NICMOS ARRAY SYSTEMATIC WITHIN-PIXEL SENSITIVITY VARIATIONS

2MASS NICMOS pixels exhibit systematic within-pixel sensitivity variations.  In particular, there is an insensitive region at the edge of the pixels and a region of reduced sensitivity at the middle of each pixel.  The NICMOS arrays were rotated slightly relative to the declination scan direction so that a source would sample various positions within a pixel as the scanning proceeded.  The measured array rotations are as follows:

        |       J       |       H       |      Ks       | Nscans
--------+---------------+---------------+---------------+--------
 North1 |  0.372+/-.020 |  0.440+/-.022 |  0.326+/-.019 |  15814
 North2 | -0.303+/-.015 | -0.420+/-.016 |  0.265+/-.015 |   9132
 South  |  0.225+/-.006 |  0.252+/-.005 |  0.175+/-.005 |  34785
Array rotation relative to N-S in degrees.

The periods North1 and North2 refer to before and after the H-band array change made in August 1999. The 2MASS All-sky Catalog shows a variation in source counts as a function of dist_edge_ew with a period of one pixel (approximately 2") for data taken at CTIO. This has been attributed to the small value of the Ks-band dither (0.8 pixel over the extent of the array) and to the good seeing at the CTIO site. A similar variation is not seen in Mt. Hopkins data.
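
The quoted dither of 0.8 pixel is consistent with simple geometry: a source travelling the full 256 pixel height of an array rotated by an angle theta drifts sideways by 256 x tan(theta) pixels. The sketch below evaluates this for the measured rotations; the function name and the assumption that the dither is taken over the full 256 pixel extent are illustrative.

```python
import math

ARRAY_SIZE_PIXELS = 256


def dither_pixels(rotation_deg, length_pixels=ARRAY_SIZE_PIXELS):
    """East-west drift, in pixels, of a source crossing the array in declination."""
    return length_pixels * math.tan(math.radians(rotation_deg))


# Measured South rotations in degrees, from the table above.
for band, rotation in (("J", 0.225), ("H", 0.252), ("Ks", 0.175)):
    print(band, round(dither_pixels(rotation), 2))
```

For the South Ks value of 0.175 degrees this gives about 0.78 pixel, in line with the 0.8 pixel quoted above and less than one full pixel, so a source does not sample every position within a pixel during a scan.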

E-W variation image
The above plot shows the distribution of sources in the All-Sky Catalog as a function of dist_edge_ew. This parameter is an approximate measure of the position of the source on the arrays. The central hole is an artifact of the definition of dist_edge_ew.


OTHER ARTIFACTS
Other camera and NICMOS array artifacts are described in the explanatory supplement. These include persistence (trails of images at the frame step spacing behind bright stars due to delayed collection of charge from the photon absorbing medium), ghosts (reflections in the optics, usually from the reverse side of dichroic mirrors), and glints (scattered light from array edges). Meteor trails, airplanes, and other features are also present in the raw data.

SAMPLE DATA

A sample data set suitable for processing all-sky catalog scan 63 taken at Cerro Tololo on 2000-05-29 is available for download below. A few .rdo files are provided for verifying the operation of the rds2rdo conversion program. Entries in the raw_scans table for these scans are listed below.

    date    | day_no | hemis | scan |  tile  | s_dir | s_type | frames | as_cat | tape_label | disk_label
------------+--------+-------+------+--------+-------+--------+--------+--------+------------+------------
 2000-05-29 |   1186 | s     |   63 | 318737 | n     | sci    |    274 | y      | raws0136   | s04
 2000-05-31 |   1188 | s     |  155 | 199998 | s     | flt    |     53 | n      | raws0136   | s04
 2000-05-31 |   1188 | s     |  156 | 199998 | s     | flt    |     53 | n      | raws0136   | s04
 2000-05-31 |   1188 | s     |  157 | 199998 | s     | flt    |     53 | n      | raws0136   | s04
 2000-05-31 |   1188 | s     |  158 | 199998 | s     | flt    |     53 | n      | raws0136   | s04
 2000-05-31 |   1188 | s     |  159 | 199998 | s     | flt    |     53 | n      | raws0136   | s04
 2000-06-02 |   1190 | s     |   90 | 199999 | n     | dar    |     48 | n      | raws0136   | s04


Dark j1190090.rdo.gz j1190090.rds.gz j1190090.par h1190090.rdo.gz h1190090.rds.gz h1190090.par k1190090.rdo.gz k1190090.rds.gz k1190090.par
Twilight j1188155.rds.gz j1188155.par h1188155.rds.gz h1188155.par k1188155.rds.gz k1188155.par
Twilight j1188156.rds.gz j1188156.par h1188156.rds.gz h1188156.par k1188156.rds.gz k1188156.par
Twilight j1188157.rds.gz j1188157.par h1188157.rds.gz h1188157.par k1188157.rds.gz k1188157.par
Twilight j1188158.rds.gz j1188158.par h1188158.rds.gz h1188158.par k1188158.rds.gz k1188158.par
Twilight j1188159.rds.gz j1188159.par h1188159.rds.gz h1188159.par k1188159.rds.gz k1188159.par
Survey j1186063.rdo.gz j1186063.rds.gz j1186063.par h1186063.rdo.gz h1186063.rds.gz h1186063.par k1186063.rdo.gz k1186063.rds.gz k1186063.par

Coadded repixelated fits format image files for scan 63 above are available for downloading. The file name format of these 512 x 1024 pixel fits image files is {band}i{scan number}{coadd number}.fits.bz2. Image file selections for downloading may be made here.


Requests for raw pixel data from the archive can be made to Rae Stiening at the University of Massachusetts.  Last edited April 28, 2007.