TOPCAT - Tool for OPerations on Catalogues And Tables
Version 1.8

Starlink User Note 253
Mark Taylor
13 October 2005

$Id: sun253.xml,v 1.80 2005/10/13 13:52:07 mbt Exp $

Starlink Project



Abstract

TOPCAT is an interactive graphical viewer and editor for tabular data. It has been designed for use with astronomical tables such as object catalogues, but is not restricted to astronomical applications. It understands a number of different astronomically important formats, and more formats can be added. It is designed to cope well with large tables; a million rows by a hundred columns should not present a problem even with modest memory and CPU resources.

It offers a variety of ways to view and analyse the data, including a browser for the cell data themselves, viewers for information about table and column metadata, and facilities for plotting, calculating statistics and joining tables using flexible matching algorithms. Using a powerful and extensible Java-based expression language, new columns can be defined and row subsets selected for separate analysis. Selecting a row can be configured to trigger an action, for instance displaying an image of the catalogue object in an external viewer. Table data and metadata can be edited and the resulting modified table can be written out in a wide range of output formats.

TOPCAT is written in pure Java and is available under the GNU General Public Licence. Its underlying table processing facilities are provided by the Starlink Tables Infrastructure Library.


1 Introduction

TOPCAT is a graphical program which can examine, analyse, combine, edit and write out tables. A table is, roughly, something with columns and rows; each column contains objects of the same type (for instance floating point numbers) and each row has an entry for each of the columns (though some entries might be blank). A common astronomical example of a table is an object catalogue.

TOPCAT can read in tables in a number of formats from various sources, allow you to inspect and manipulate them in various ways, and if you have edited them optionally write them out in the modified state for later use, again in a variety of formats. Here is a summary of its main capabilities:

The general idea of the program is quite straightforward. At any time, it has a list of tables it knows about - these are displayed in the Control Window, which is the first thing you see when you start up the program. You can add to the list by loading tables in, or by some actions which create new tables from the existing ones. When you select a table in the list by clicking on it, you can see general information about it in the Control Window, and you can also open more specialised view windows which allow you to inspect it in more detail or edit it. Some of the actions you can take, such as changing the current Sort Order, Row Subset or Column Set, change the Apparent Table, which is a view of the table used for things such as saving it and performing row matches. Changes that you make do not directly modify the tables on disk (or wherever they came from), but if you want to save the changes you have made, you can write the modified table(s) to a new location.

The main body of this document explains these ideas and capabilities in more detail, and Appendix A gives a full description of all the windows which form the application. While the program is running, this document is available via the online help system - clicking the Help toolbar button in any window will pop up a help browser open at the page which describes that window. This document is heavily hyperlinked, so you may find it easier to read in its HTML form than on paper.

Recent news about the program can be found on the TOPCAT web page. TOPCAT has been written by the Starlink project. The underlying table handling facilities are supplied by the Starlink Tables Infrastructure Library (STIL), which is documented more fully in SUN/252. TOPCAT is written in pure Java (the current version requires J2SE1.4; it will run under version 1.5/5.0, but certain features don't work correctly), which makes it highly portable, since it can run on any machine which has a suitable Java installation. However, some of the external viewer applications it talks to rely on non-Java code, so one or two facilities, such as displaying spectra, may be absent in some cases. TOPCAT is available under the terms of the GNU General Public License.


2 Apparent Table

The Apparent Table is a particular view of a table which can be influenced by some of the viewing controls.

When you load a table into TOPCAT it has a number of characteristics like the number of columns and rows it contains, the order of the rows that make up the data, the data and metadata themselves, and so on. While manipulating it you can modify the way that the table appears to the program, by changing or adding data or metadata, or changing the order or selection of columns or rows that are visible. For each table its "apparent table" is a table which corresponds to the current state of the table according to the changes that you have made.

In detail, the apparent table consists of the table as it was originally imported into the program plus any of the following changes that you have made:

The apparent table is used in the following contexts:

Data Window
The Data window always shows the rows and columns of the apparent table, so if you are in doubt about what form a table will get exported in, you can see what it looks like there.
Exports
When you save a table, or export it by dragging it off the Table List panel in the Control Window, or create a duplicate table, it is the apparent table which is copied. So for instance if you define a subset containing only the first ten rows of a table and then save it to a new table, or create a duplicate within TOPCAT using the Duplicate Table toolbar button, the resulting table will contain only those ten rows.
Joins
When you use the Match Window or Concatenation Window to construct a new table on the basis of one or more existing ones, the new table will be built on the basis of the apparent versions of the tables being operated on.
Some of the other table view windows are affected too, for instance the Columns window displays its columns in the order that they appear in the Apparent Table.
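
As an illustration only, the following Python sketch models how an apparent table could be derived from the table as loaded plus the current Row Subset, Sort Order and Column Set. The function name apparent_table and the list-of-dictionaries layout are assumptions made for this example; they are not TOPCAT's internal representation.

```python
def apparent_table(rows, subset=None, sort_column=None, ascending=True,
                   visible_columns=None):
    """Return the view of `rows` implied by the current settings.

    rows            -- list of dicts, one per row, as originally loaded
    subset          -- predicate selecting rows, or None for all rows
    sort_column     -- column name to sort on, or None for original order
    visible_columns -- ordered list of column names, or None for all
    """
    # Row Subset: keep only the selected rows.
    view = [r for r in rows if subset is None or subset(r)]
    # Sort Order: reorder the surviving rows.
    if sort_column is not None:
        view = sorted(view, key=lambda r: r[sort_column],
                      reverse=not ascending)
    # Column Set: keep only visible columns, in their current order.
    if visible_columns is not None:
        view = [{c: r[c] for c in visible_columns} for r in view]
    return view


catalogue = [
    {"id": 1, "ra": 10.5, "vmag": 17.2},
    {"id": 2, "ra": 11.0, "vmag": 19.1},
    {"id": 3, "ra": 9.8, "vmag": 15.4},
]

# Bright objects only, sorted by magnitude, with the "ra" column hidden.
bright = apparent_table(catalogue,
                        subset=lambda r: r["vmag"] < 18,
                        sort_column="vmag",
                        visible_columns=["id", "vmag"])
```

The loaded data (catalogue here) is left untouched; only the derived view changes, which mirrors the way exports and joins see the apparent table rather than the original.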

2.1 Row Subsets

An important feature of TOPCAT is the ability to define and use Row Subsets. A Row Subset is a selection of the rows within a whole table being viewed within the application, or equivalently a new table composed from some subset of its rows. You can define these and use them in several different ways; the usefulness comes from defining them in one context and using them in another. The Subset Window displays the currently defined Row Subsets and permits some operations on them.

At any time each table has a current row subset, and this affects the Apparent Table. You can always see what it is by looking at the "Row Subset" selector in the Control Window when that table is selected; by default it is one containing all the rows. You can change it by choosing from this selector or as a result of some other actions.

Other contexts in which subsets can be used are picking a selection of rows from which to calculate in the Statistics Window and marking groups of rows to plot using different markers in the Plot Window.

2.1.1 Defining Subsets

You can define a Row Subset in one of the following ways:

Selecting rows in the browser
You can select a single row in the Data Window by clicking on it, or select a group of adjacent rows by dragging the mouse over them. You can add more rows to the selection by keeping the Control key pressed while you do it. Once you have a set of rows selected you can use the Subset From Selected Rows or Subset From Unselected Rows buttons to create a new subset based on the set of highlighted rows or their complement.

Combining this with sorting the rows in the table can be useful; if you do a Sort Up on a given column and then drag out the top few rows of the table you can easily create a subset consisting of the highest values of a given column.

Defining an algebraic expression
In the Subset Window, the Add New Subset button pops up the Algebraic Subset Window, which allows you to define a new subset using an algebraic expression based on the values of the cells in each row. The format of such expressions is described in Section 6.
Visible plotted points
In the Plot Window you can plot columns against each other, and subsequently zoom in and out using the mouse. If you zoom to display only some of the plotted points and then use the New Subset From Visible button then a new subset will be created containing only rows represented by points in the field of view of the plot at the time.
Selected plotted points
For more control over which plotted points are to be included in a subset, you can use the Draw Subset Region button in the Plot Window. This allows you to trace out with the mouse a region or regions of any shape, creating a new subset containing only those rows represented by the points within those regions.
Boolean columns
Any column which has a boolean (true/false) type value can be used as a subset; rows in which it has a true value are in the subset and others are not. Any boolean column in a table is made available as a row subset with the same name when the table is imported.

In all these cases you will be asked to assign a name for the subset. As with column names, it is a good idea to follow a few rules for these names so that they can be used in algebraic expressions. They should be:

In the first two subset definition methods above, the current subset will be set immediately to the newly created one.

2.2 Row Order

You can sort the rows of each table according to the values in a selected column. Normally you will want to sort on a numeric column, but other values may be sortable too, for instance a String column will sort alphabetically. Some kinds of columns (e.g. array ones) don't have any well-defined order, and it is not possible to select these for sorting on.

At any time, each table has a current row order, and this affects the Apparent Table. You can always see what it is by looking under the "Sort Order" item in the Control Window when that table is selected; by default it is "(none)", which means the rows have the same order as that of the table they were loaded in from. The little arrow indicates whether the sense of the sort is up or down. You can change the sort order by selecting a column name from this control, and change the sense by clicking on the arrow. The sort order can also be changed by using menu items in the Columns Window or right-clicking popup menus in the Data Window.

Selecting a column to sort by calculates the new row order by performing a sort on the cell values there and then. If the table data change somehow (e.g. because you edit cells in the table) then it is possible for the sort order to become out of date.
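
The following Python sketch shows why a sort order can become out of date. It assumes, purely for illustration, that the row order is stored as a list of row indices calculated once when the sort is requested; the function name compute_row_order is hypothetical.

```python
def compute_row_order(values, ascending=True):
    """Return row indices ordered by the cell values at the time of the call."""
    return sorted(range(len(values)), key=lambda i: values[i],
                  reverse=not ascending)


vmag = [17.2, 19.1, 15.4]
order = compute_row_order(vmag)

# Editing a cell afterwards does not update the stored order,
# so reading the rows through it no longer gives sorted values.
vmag[2] = 20.0
stale = [vmag[i] for i in order]

# Requesting the sort again brings the order back up to date.
order = compute_row_order(vmag)
fresh = [vmag[i] for i in order]
```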

The current row order affects the Apparent Table, and hence determines the order of rows in tables which are exported in any way (e.g. written out) from TOPCAT. You can always see the rows in their currently sorted order in the Data Window.

2.3 Column Set

When each table is imported it has a list of columns. Each column has header information which determines the kind of data which can fill the cells of that column as well as a name, and maybe some additional information like units and Unified Content Descriptor. All this information can be viewed, and in some cases modified, in the Columns Window.

During the lifetime of the table within TOPCAT, this list of columns can be changed by adding new columns, hiding (and perhaps subsequently revealing) existing columns, and changing their order. The current state of which columns are present and visible and what order they are in is collectively known as the Column Set, and affects the Apparent Table. The current Column Set is always reflected in the order in which columns are displayed in the Data Window and Statistics Window. The Columns Window shows all the known columns, including hidden ones, in Column Set order; whether they are currently visible is indicated by the (leftmost) "Visible" column.

You can affect the current Column Set in the following ways:

Hide/Reveal columns
In the Columns Window you can toggle columns between hidden and visible by clicking on their box in the Visible column. To make a group of columns hidden or visible at once, select the corresponding rows (drag the mouse over them to select a contiguous group; hold the Control key down to add more single rows or contiguous groups to the selection) and hit the Hide Selected or Reveal Selected button in the toolbar or menu. Note that when selecting rows you should not drag the mouse over the Visible column; do it somewhere in the middle of the table.

You can also hide a column by right-clicking on it in the Data Window, which brings up a popup menu - select the Hide option. To make it visible again you have to go to the Columns Window as above.

Move Columns
In the Data Window you can move columns around by dragging the grey column header left or right to a new position (as usual in a JTable). This affects the Column Set, as you can see if you watch the Columns Window while you do it.
Add Columns
You can use the New Synthetic Column or New Sky Coordinate Columns buttons in the Columns Window or the (right-click) popup menu in the Data Window to add new columns derived from existing ones.
Replace a Column
If a column is selected in the Columns Window or from the Data Window popup menu you can use the Replace Column with Synthetic button. This is similar to adding a synthetic column as described in the previous item, but it pops up a new column dialogue with similar characteristics (name, units etc) to those of the column that's being replaced, and when completed it slots the new column in to the table, hiding the old one.
Add a Subset Column
If you have defined a Row Subset somehow and you want it to appear explicitly in the table (for instance so that when you write the table out the selection is saved) you can select that subset in the Subset Window and use the To Column button, which will add a new boolean column to the table with the value true for rows which are part of that subset and false for the other rows.
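
The relationship between Row Subsets and boolean columns can be sketched in Python as follows. The helper names subset_to_column and column_to_subset are hypothetical, and a subset is assumed here to be a collection of row indices; this is a sketch of the idea rather than a description of how TOPCAT stores subsets.

```python
def subset_to_column(rows, subset_rows, name):
    """Add a boolean column `name` which is True for rows in the subset."""
    members = set(subset_rows)
    for index, row in enumerate(rows):
        row[name] = index in members
    return rows


def column_to_subset(rows, name):
    """Recover the row indices for which boolean column `name` is True."""
    return [index for index, row in enumerate(rows) if row[name]]


animals = [{"species": "pig"}, {"species": "cow"}, {"species": "ant"}]

# Make the subset explicit so that it is saved along with the table.
subset_to_column(animals, [0, 1], "mammal")

# On reloading, the boolean column can serve as a subset again.
mammals = column_to_subset(animals, "mammal")
```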


3 Table Formats

TOPCAT supports a wide variety of tabular data formats. In most cases these are file formats for tables stored as single files on a disk or at the end of a URL, but there are other possibilities, for instance a table you have opened could be the result of an SQL query on a database.

Since you can load a table from one format and save it in a different one, TOPCAT can be used to convert a table from one format to another. If this is all you want to do however, you may find it more convenient to use the tcopy command line utility in the STILTS package.

The format handling is extensible, so new formats can be added fairly easily. All the table input/output is handled by STIL, the Starlink Tables Infrastructure Library; more detailed descriptions of the I/O capabilities can be found in its documentation.

The following subsections describe the available formats for reading and writing tables. The two operations are separate, so not all the supported input formats have matching output formats and vice versa.

3.1 Supported Input Formats

Loading tables into TOPCAT is done either from the command line when you start the program up or using the Load Table dialogue. For FITS and VOTable formats the file format can be detected automatically (note this is done by looking at the file content; it has nothing to do with filename extensions). For other formats though, for instance ASCII or Comma-Separated Values, you will have to specify the format that the file is in. In the Load Window, there is a selection box from which you can choose the format, and from the command line you use the -f flag - see Section 7 for details. You can always specify the format rather than using automatic detection if you prefer - this can be a good idea if a table appears to be failing to load in a surprising way, since it may give you a more detailed error message.

In either case, table locations may be given as filenames or as URLs, and any data compression (gzip, unix compress and bzip2) will be automatically detected and dealt with.
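
This document does not say how the compression type is detected. One common approach, assumed here for illustration only, is to inspect the first few bytes of the stream, since gzip, unix compress and bzip2 data each begin with a characteristic signature. The function name detect_compression is hypothetical.

```python
import bz2
import gzip

# Leading byte signatures of the three compression formats.
MAGIC = [
    (b"\x1f\x8b", "gzip"),
    (b"\x1f\x9d", "compress"),
    (b"BZh", "bzip2"),
]


def detect_compression(head):
    """Return the compression name suggested by the leading bytes, or None."""
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    return None


sample = gzip.compress(b"RECNO,SPECIES\n")
kind = detect_compression(sample[:4])
```

Detection of this kind depends only on the content, so it works equally for files and for data fetched from a URL, whatever the name looks like.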

Note: in some earlier versions of TOPCAT, ASCII format tables could be detected automatically, so you could load them by typing something like "topcat table.txt". In the current version, you have to signal that this is an ASCII table, for instance by typing "topcat -f ascii table.txt".

The following sections describe the table formats which TOPCAT can read.

3.1.1 FITS

FITS binary and ASCII table extensions can be read. Unless told otherwise, TOPCAT will display the first TABLE or BINTABLE extension in a given FITS file. If a later extension is required, this is indicated by giving the extension number after a '#' at the end of the table location. The first extension (first HDU after the primary HDU) is numbered 1. Thus in a compressed FITS table named "spec23.fits.gz" with one primary HDU and two BINTABLE extensions, you would view the first one using the name "spec23.fits.gz" or "spec23.fits.gz#1" and the second one using the name "spec23.fits.gz#2". The suffix "#0" is never used for a legal FITS file, since the primary HDU cannot contain a table.

You can select which extension to use more conveniently than by specifying the HDU numbers if you use the Hierarchy Browser to load the table.

If the table has been written using TOPCAT's "fits-plus" output format (see Section 3.2.1) then the metadata will be read in from the primary HDU as well.

If the table is stored in a FITS binary table extension in a file on local disk in uncompressed form, then the table is 'mapped' into memory - this generally means fast loading and low memory use, even in the absence of TOPCAT's -disk flag (Section 7.1).

3.1.2 VOTable

VOTable is an XML-based format for tabular data endorsed by the International Virtual Observatory Alliance; while the tabular data which can be encoded is by design close to what FITS allows, it provides for much richer encoding of structure and metadata. TOPCAT is believed to read any table which conforms to the VOTable 1.0 or VOTable 1.1 specification. This includes tables in which the cell data are included in-line as XML elements (VOTable/TABLEDATA format), or included/referenced as a FITS table (VOTable/FITS) or included/referenced as a raw binary stream (VOTable/BINARY). TOPCAT does not attempt to be fussy about input VOTable documents, and it will have a good go at reading VOTables which violate the standards in various ways.

VOTable documents can have a complicated hierarchical structure, and may contain more than one actual table. Unless told otherwise, TOPCAT will load the first table it finds in the document, so in the (common) case that the document holds exactly one table, giving the filename will load that sole table. To display a table other than the first, you must indicate the zero-based index of the TABLE element in a breadth-first search after a '#' character at the end of the table specification. Here is an example VOTable document:

   <VOTABLE>
     <RESOURCE>
       <TABLE name="Star Catalogue"> ... </TABLE>
       <TABLE name="Galaxy Catalogue"> ... </TABLE>
     </RESOURCE>
   </VOTABLE>
If this is available in a file named "cats.xml" then open the Star Catalogue using the name "cats.xml" or "cats.xml#0", and the Galaxy Catalogue using the name "cats.xml#1".
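
To make the numbering concrete, here is a minimal Python sketch which lists the TABLE elements of a document in breadth-first order, so that the position of each one in the list is the index to give after the '#'. It uses the standard xml.etree.ElementTree module; the function name tables_breadth_first is hypothetical and the sketch makes no claim to match TOPCAT's own implementation.

```python
from collections import deque
import xml.etree.ElementTree as ET


def tables_breadth_first(root):
    """Return the TABLE elements under `root` in breadth-first order."""
    found = []
    queue = deque([root])
    while queue:
        element = queue.popleft()
        # Ignore any namespace prefix of the form "{uri}TABLE".
        if element.tag.rsplit("}", 1)[-1] == "TABLE":
            found.append(element)
        # Iterating over an element yields its direct children.
        queue.extend(element)
    return found


DOC = """
<VOTABLE>
  <RESOURCE>
    <TABLE name="Star Catalogue"/>
    <TABLE name="Galaxy Catalogue"/>
  </RESOURCE>
</VOTABLE>
"""

names = [t.get("name") for t in tables_breadth_first(ET.fromstring(DOC))]
```

With the example document, names[0] is the Star Catalogue and names[1] is the Galaxy Catalogue, matching the "#0" and "#1" suffixes described above.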

3.1.3 ASCII

In many cases tables are stored in some sort of unstructured plain text format, with cells separated by spaces or some other delimiters. There is a wide variety of such formats depending on what delimiters are used, how columns are identified, whether blank values are permitted and so on. It is impossible to cope with them all, but TOPCAT attempts to make a good guess about how to interpret a given ASCII file as a table, which in many cases is successful. In particular, if you just have columns of numbers separated by something that looks like spaces, you should be just fine.

Here are the detailed rules for how the ASCII-format tables are interpreted:

If the list of rules above looks frightening, don't worry: in many cases TOPCAT ought to make sense of a table without you having to read the small print. Here is an example of a suitable ASCII-format table:

    #
    # Here is a list of some animals.
    #
    # RECNO  SPECIES         NAME         LEGS   HEIGHT/m
      1      pig             "Pigling Bland"  4  0.8
      2      cow             Daisy        4      2
      3      goldfish        Dobbin       ""     0.05
      4      ant             ""           6      0.001
      5      ant             ""           6      0.001
      6      ant             ''           6      0.001
      7      "queen ant"     'Ma\'am'     6      2e-3
      8      human           "Mark"       2      1.8
In this case it will identify the following columns:
    Name       Type
    ----       ----
    RECNO      Short
    SPECIES    String
    NAME       String
    LEGS       Short
    HEIGHT/m   Float
It will also use the text "Here is a list of some animals" as the Description parameter of the table. Without any of the comment lines, it would still interpret the table, but the columns would be given the names col1..col5.

If you understand the format of your files but they don't exactly match the criteria above, the best thing is probably to write a simple free-standing program or script which will convert them into the format described here. You may find Perl or awk suitable languages for this sort of thing.
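
As a sketch of such a conversion script, written here in Python rather than Perl or awk, the following turns a delimiter-separated file into a table laid out like the example above. The input layout (first line holds the column names, cells separated by '|') and the function names quote_cell and to_ascii_table are assumptions for this example. The quoting follows what the example table shows: blank cells become "", and cells containing spaces are quoted.

```python
def quote_cell(text):
    """Quote a cell so that blank values and embedded spaces survive."""
    text = text.strip()
    if text == "":
        return '""'
    if '"' in text:
        # Use single quotes, escaping embedded ones as in 'Ma\'am'.
        return "'" + text.replace("'", "\\'") + "'"
    if " " in text or "'" in text:
        return '"' + text + '"'
    return text


def to_ascii_table(lines, delimiter="|"):
    """Convert delimited lines (first line = column names) to ASCII format."""
    records = [line.rstrip("\n").split(delimiter)
               for line in lines if line.strip()]
    names = [name.strip() for name in records[0]]
    body = [[quote_cell(cell) for cell in record] for record in records[1:]]
    table = [names] + body
    widths = [max(len(row[i]) for row in table) for i in range(len(names))]
    out = []
    for n, row in enumerate(table):
        # The column names go in a comment line ahead of the data.
        prefix = "# " if n == 0 else "  "
        cells = [cell.ljust(width) for cell, width in zip(row, widths)]
        out.append((prefix + "  ".join(cells)).rstrip())
    return "\n".join(out)


converted = to_ascii_table([
    "RECNO|SPECIES|NAME|LEGS",
    "1|pig|Pigling Bland|4",
    "3|goldfish|Dobbin|",
])
```

The sketch assumes every input line has the same number of cells and that column names contain no spaces.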

This format is not detected automatically - you must specify that you wish to load a table in ascii format.

3.1.4 Comma-Separated Values

Comma-separated value ("CSV") format is a common semi-standard text-based format in which fields are delimited by commas. Spreadsheets and databases are often able to export data in some variant of it. The intention is that TOPCAT can read tables in the version of the format spoken by MS Excel amongst other applications, though the documentation on which it was based was not obtained directly from Microsoft.

The rules for data which it understands are as follows:

This format is not detected automatically - you must specify that you wish to load a table in csv format.

3.1.5 SQL Database Queries

With appropriate configuration, TOPCAT can be used to examine the results of queries on an SQL-compatible relational database.

Database queries can be specified as a string in the form:

   jdbc:driver-specific-url#sql-query

The exact form is dependent on the driver. Here is an example for MySQL:
   jdbc:mysql://localhost/astro1?user=mbt#SELECT ra, dec FROM swaa WHERE vmag<18
which would get a two-column table (the columns being "ra" and "dec"), constructed from certain rows from the table "swaa" in the database "astro1" on the local host, using the access privileges of user mbt.
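
The two parts of such a location can be separated as in the Python sketch below. It assumes that the driver-specific URL ends at the first '#' and that everything after it is the query text; the function name split_jdbc_location is hypothetical.

```python
def split_jdbc_location(location):
    """Split 'jdbc:driver-specific-url#sql-query' into (url, query)."""
    if not location.startswith("jdbc:"):
        raise ValueError("not a JDBC location: " + location)
    # Assumption: the first '#' separates the URL from the query.
    url, separator, query = location.partition("#")
    if not separator or not query.strip():
        raise ValueError("no SQL query after '#': " + location)
    return url, query.strip()


url, query = split_jdbc_location(
    "jdbc:mysql://localhost/astro1?user=mbt"
    "#SELECT ra, dec FROM swaa WHERE vmag<18"
)
```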

Fortunately you don't have to construct this by hand; there is an SQL Query Dialogue to assist in putting it together.

Note that TOPCAT does not view a table in the database directly, but the result of an SQL query on that table. If you want to view the whole table you can use the query

   SELECT * FROM table-name
but be aware that such a query might be expensive on a large table.

Use of SQL queries requires some additional configuration of TOPCAT; see Section 7.3.

3.1.6 World Data Center

Some support is provided for files produced by the World Data Centre for Solar Terrestrial Physics. The format itself apparently has no name, but files in this format look something like the following:

  Column formats and units - (Fixed format columns which are single space seperated.)
  ------------------------
  Datetime (YYYY mm dd HHMMSS)            %4d %2d %2d %6d      -
                                          %1s
  aa index - 3-HOURLY (Provisional)       %3d                  nT

  2000 01 01 000000  67
  2000 01 01 030000  32
      ...
Support for WDC tables is experimental - it may not be very robust.

This format is not detected automatically - you must specify that you wish to load a table in wdc format.

3.2 Supported Output Formats

Writing out tables from TOPCAT is done using the Save Table Window. In general you have to specify the format in which you want the table to be output by selecting from the Save Window's Table Output Format selector; the following sections describe the possible choices. In some cases there are variants within each format - these are described as well.

The program has no "native" file format, but if you have no particular preference about which format to save tables to, FITS is a good choice. Uncompressed FITS tables do not in most cases have to be read all the way through (they are 'mapped' into memory), which makes them very fast to load up. The FITS format which is written by default (also known as "FITS-plus") also uses a trick to store extra metadata, such as table parameters and UCDs in a way TOPCAT can read in again later (see Section 3.2.1). These files are quite usable as normal FITS tables by other applications, but they will only be able to see the limited metadata stored in the FITS headers. If you want to write to a format which retains all metadata in a portable format, then one of the Section 3.2.2 formats might be better.

3.2.1 FITS

When saving in FITS format a new file is written consisting of two HDUs (Header+Data Units): a primary one (required by the FITS standard), and a single extension of type BINTABLE containing the table data.

There are two variants of this format:

fits-basic
The primary HDU contains only very minimal headers and no data.
fits-plus
The primary HDU contains an array of bytes which stores the full table metadata as the text of a VOTable document, along with headers which indicate that this has been done. Most FITS table readers will ignore this altogether and treat the file just as if it contained only the table. When TOPCAT (or other STIL-based applications) read it however, they read out the metadata and make it available for use. In this way you can store your data in the efficient and widely portable FITS format without losing the additional metadata such as table parameters, column UCDs, lengthy column descriptions etc that may be attached to the table. Other, more standard schemes exist for combining the benefits of FITS and VOTable, but suffer from some disadvantages: votable-fits-inline is hard to process efficiently (in particular the data cannot easily be mapped into memory) and votable-fits-href requires that you keep your data in two separate files, which can get separated from each other. If you want to ensure that the metadata are available to other VOTable-aware programs, you should use one of the normal VOTable formats.
In general, you can just let TOPCAT detect the format automatically and not worry about which of these variants is being used - if fits-plus is being used you just get some hidden benefits.

3.2.2 VOTable

When a table is saved to VOTable format, a document conforming to the VOTable 1.0 specification containing a single TABLE element within a single RESOURCE element is written.

There are a number of variants which determine the form in which the table data (DATA element) is written:

votable-tabledata
TABLEDATA element (pure XML)
votable-binary-inline
BINARY element containing base64-encoded data within the document
votable-fits-href
FITS element containing a reference to an external newly-written FITS file (with a name derived from that of the VOTable document)
votable-binary-href
BINARY element containing a reference to an external newly-written binary file (with a name derived from that of the VOTable document)
votable-fits-inline
FITS element containing base64-encoded data within the document
See the VOTable specification for more explanation of what these variants mean. They can all be read by the VOTable input handler.

3.2.3 ASCII

Tables can be written using a format which is compatible with the ASCII input format. It writes as plainly as possible, so should stand a good chance of being comprehensible to other programs which require some sort of plain text rendition of a table.

The first line is a comment (starting with a "#" character) which names the columns, and an attempt is made to line up data in columns using spaces. Here is an example of a short table written in this format:

   # index Species  Name   Legs Height Mammal
     1     pig      Bland  4    0.8    true  
     2     cow      Daisy  4    2.0    true  
     3     goldfish Dobbin 0    0.05   false 
     4     ant      ""     6    0.0010 false 
     5     ant      ""     6    0.0010 false 
     6     human    Mark   2    1.9    true  

3.2.4 Text

Tables can be written to a simple text-based format which is designed to be read by humans. No reader exists for this format.

Here is an example of a short table written in this format:

   +-------+----------+--------+------+--------+--------+
   | index | Species  | Name   | Legs | Height | Mammal |
   +-------+----------+--------+------+--------+--------+
   | 1     | pig      | Bland  | 4    | 0.8    | true   |
   | 2     | cow      | Daisy  | 4    | 2.0    | true   |
   | 3     | goldfish | Dobbin | 0    | 0.05   | false  |
   | 4     | ant      |        | 6    | 0.0010 | false  |
   | 5     | ant      |        | 6    | 0.0010 | false  |
   | 6     | human    | Mark   | 2    | 1.9    | true   |
   +-------+----------+--------+------+--------+--------+
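
A layout of this kind can be produced with a few lines of Python, shown here as a sketch only. The function name format_text_table is hypothetical, and the padding rules are inferred from the example above rather than taken from the TOPCAT source.

```python
def format_text_table(names, rows):
    """Render column names and rows as a bordered, human-readable table."""
    # Blank (None) cells are shown as empty strings.
    cells = [["" if v is None else str(v) for v in row] for row in rows]
    widths = [max([len(name)] + [len(row[i]) for row in cells])
              for i, name in enumerate(names)]
    rule = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def line(values):
        padded = [v.ljust(w) for v, w in zip(values, widths)]
        return "| " + " | ".join(padded) + " |"

    return "\n".join([rule, line(names), rule]
                     + [line(row) for row in cells]
                     + [rule])


text = format_text_table(["index", "Species"],
                         [[1, "pig"], [3, "goldfish"], [4, None]])
```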

3.2.5 Comma-Separated Values

Tables can be written to the semi-standard comma-separated value (CSV) format, described in more detail in Section 3.1.4. This can be useful for importing into certain external applications, such as some spreadsheets or databases. The first row written contains the column names.
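
For comparison, a file with the same overall shape (a first row of column names followed by the data rows) can be written with Python's standard csv module, as sketched below. The function name write_csv is hypothetical, and the quoting details of Python's default dialect may differ from those of TOPCAT's own CSV writer.

```python
import csv
import io


def write_csv(names, rows, stream):
    """Write the column names as the first row, then the data rows."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(names)
    writer.writerows(rows)


buffer = io.StringIO()
write_csv(["index", "Species", "Name"],
          [[1, "pig", "Bland, Pigling"], [4, "ant", ""]],
          buffer)
csv_text = buffer.getvalue()
```

Cells which themselves contain a comma are quoted by the writer, so they survive a round trip through a reader that understands the same conventions.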

3.2.6 SQL Tables

With appropriate configuration, TOPCAT can write out tables as new tables in an SQL-compatible relational database.

For writing, the location is specified as the following URL:

   jdbc:driver-specific-url#new-table-name

The exact form is dependent on the driver. Here is an example for MySQL:
   jdbc:mysql://localhost/astro1?user=mbt#newtab
which would write the current contents of the browser into a new table named "newtab" in the database "astro1" on the local host with the access privileges of user mbt.

Fortunately you do not have to construct this URL by hand; there is an SQL dialogue box to assist in putting it together.

Writing tables to an SQL database requires some additional configuration of TOPCAT; see Section 7.3.

3.2.7 HTML

A table can be written out as an HTML 3.2 TABLE element, suitable for use as a web page or insertion into one.

There are two variants:

HTML
A freestanding HTML document, complete with HTML, HEAD and BODY tags is output.
HTML-element
Only the TABLE element representing the table is output; this should normally be embedded in a larger HTML document before use.

3.2.8 LaTeX

A table can be written out as a LaTeX tabular environment, suitable for insertion into a document intended for publication.

There are two variants:

LaTeX
The tabular element alone is output; this will have to be embedded in a larger LaTeX document before use.
LaTeX-document
A freestanding LaTeX document, consisting of the tabular within a table within a document is output.

Obviously, this isn't so suitable for very large tables.

3.2.9 Mirage Format

Mirage is a powerful standalone Java tool developed at Bell Labs for analysis of multidimensional data. It uses its own file format for input. TOPCAT can write tables in the input format which Mirage uses, so that you can prepare tables in TOPCAT and write them out for subsequent use by Mirage.

It is also possible in principle to launch Mirage directly from within TOPCAT, using the Export To Mirage item on the Control Window's File menu; this will cause Mirage to start up viewing the currently selected Apparent Table. In order for this to work the Mirage classes must be on your classpath (see Section 7.2.1) when TOPCAT is run.

There appears to be a bug in Mirage which means this does not always work - sometimes Mirage starts up with no data loaded into it. In this case you will have to save the data to disk in Mirage format, start up Mirage separately, and load the data in using the New Dataset item in Mirage's Console menu.

Note that when Mirage has been launched from TOPCAT, exiting Mirage or closing its window will exit TOPCAT as well.

3.3 Custom I/O Formats

It is in principle possible to configure TOPCAT to work with table file formats other than the ones listed in this section. It does not require any upgrade of TOPCAT itself, but you have to write or otherwise acquire an input and/or output handler for the table format in question.

The steps that you need to take are:

  1. Write Java classes which constitute your input and/or output handler
  2. Ensure that these classes are available on your classpath while TOPCAT is running (see Section 7.2.1)
  3. Set the startable.readers and/or startable.writers system property to the name of the handler classes (see Section 7.2.3)

Explaining how to write such handlers is beyond the scope of this document - see the user document and javadocs for STIL.


4 Joins and Matches

TOPCAT allows you to join two or more tables together to produce a new one in a variety of ways, and also to identify "similar" rows within a single table according to their cell contents. This section describes the facilities for performing these related operations.

There are two basic ways to join tables together: top-to-bottom and side-by-side. A top-to-bottom join (which here I call concatenation) is fairly straightforward in that it just requires you to decide which columns in one table correspond to which columns in the other. A side-by-side join is more complicated - it is rarely the case that row i in the first table should correspond to row i in the second one, so it is necessary to provide some criteria for deciding which (if any) row in the second table corresponds to a given row in the first. In other words, some sort of matching between rows in different tables needs to take place. This corresponds to what is called a join in database technology. Matching rows within a single table is a useful operation which involves many of the same issues, so that is described here too.

4.1 Concatenating Tables

Two tables can be concatenated using the Concatenation Window, which just requires you to specify the two tables to be joined, and for each column in the first ("Base") table, which column in the second ("Appended") table (if any) corresponds to it. The Apparent Table is used in each case. The resulting table, which is added to the list of known tables in the Control Window, has the same columns as the Base table, and a number of rows equal to the sum of the number of rows in the Base and Appended tables.

As a very simple example, concatenating these two tables:

   Messier   RA       Dec      Name
   -------   --       ---      ----
   97        168.63   55.03    Owl Nebula
   101       210.75   54.375   Pinwheel Galaxy
   64        194.13   21.700   Black Eye Galaxy
and
   RA2000    DEC2000   ID
   ------    -------   --
   185.6     58.08     M40
   186.3     18.20     M85
with the assignments RA->RA2000, Dec->DEC2000 and Messier->ID would give:
   Messier   RA       Dec      Name
   -------   --       ---      ----
   97        168.63   55.03    Owl Nebula
   101       210.75   54.375   Pinwheel Galaxy
   64        194.13   21.700   Black Eye Galaxy
   M40       185.6    58.08
   M85       186.3    18.20
Of course it is the user's responsibility to ensure that the correspondence of columns is sensible (that the two corresponding columns mean the same thing).
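
The operation can be sketched in plain Java as follows. This is only an illustration of the idea, not TOPCAT's implementation: rows are held as Object arrays, a blank cell is represented by null, and the class name and the colMap array are invented for the example. No check is made that corresponding columns hold compatible types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ConcatSketch {

    /**
     * Appends the rows of the Appended table to those of the Base table.
     * colMap[i] is the index of the Appended column assigned to Base
     * column i, or -1 if no Appended column corresponds to it.
     */
    static List<Object[]> concatenate(List<Object[]> base,
                                      List<Object[]> appended,
                                      int[] colMap) {
        List<Object[]> out = new ArrayList<>(base);
        for (Object[] row : appended) {
            Object[] newRow = new Object[colMap.length];
            for (int i = 0; i < colMap.length; i++) {
                // Unassigned Base columns are left blank (null).
                newRow[i] = colMap[i] >= 0 ? row[colMap[i]] : null;
            }
            out.add(newRow);
        }
        return out;
    }

    public static void main(String[] args) {
        // Base columns: Messier, RA, Dec, Name
        List<Object[]> base = Arrays.asList(
            new Object[] {97, 168.63, 55.03, "Owl Nebula"},
            new Object[] {101, 210.75, 54.375, "Pinwheel Galaxy"},
            new Object[] {64, 194.13, 21.700, "Black Eye Galaxy"});
        // Appended columns: RA2000, DEC2000, ID
        List<Object[]> appended = Arrays.asList(
            new Object[] {185.6, 58.08, "M40"},
            new Object[] {186.3, 18.20, "M85"});
        // Messier->ID, RA->RA2000, Dec->DEC2000, Name->(nothing)
        int[] colMap = {2, 0, 1, -1};
        for (Object[] row : concatenate(base, appended, colMap)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The result has the four Base columns and five rows, with the Name cell blank for the two appended rows.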

You can perform a concatenation using the Concatenation Window; obtain this using the Concatenate Tables button in the Control Window.

4.2 Matching Rows Between Tables

When joining two tables side-by-side you need to identify which row(s) in one correspond to which row(s) in the other. Conceptually, this is done by looking at each row in the first table, somehow identifying in the second table which row "refers to the same thing", and putting a new row in the joined table which consists of all the fields of the row in the first table, followed by all the fields of its matched row in the second table. The resulting table then has a number of columns equal to the sum of the number of columns in both input tables.

In practice, there are a number of complications. For one thing, each row in one table may be matched by zero, one or many rows in the other. For another, defining what is meant by "referring to the same thing" may not be straightforward. There is also the problem of actually identifying these matches in a relatively efficient way (without explicitly comparing each row in one table with each row in the other, which would be far too slow for large tables).

A common example is the case of matching two object catalogues - suppose we have the following catalogues:

    Xpos       Ypos        Vmag
    ----       ----        ----
   1134.822    599.247     13.8
    659.68    1046.874     17.2
    909.613    543.293      9.3
and
    x           y          Bmag
    -           -          ---- 
   909.523     543.800     10.1
   1832.114    409.567     12.3
   1135.201    600.100     14.6
    702.622   1004.972     19.0
and we wish to combine them to create one new catalogue with a row for each object which appears in both tables. To do this, you have to specify what counts as a match - in this case let's say that a row in one table matches (refers to the same object as) a row in the other if the distance between the positions indicated by their X and Y coordinates matches to within one unit (sqrt((Xpos-x)^2 + (Ypos-y)^2) <= 1). Then the catalogue we will end up with is:
    Xpos       Ypos        Vmag    x           y          Bmag
    ----       ----        ----    -           -          ---- 
   1134.822    599.247     13.8   1135.201    600.100     14.6
    909.613    543.293      9.3    909.523    543.800     10.1
There are a number of variations on this however - your match criteria might involve sky coordinates instead of Cartesian ones (or not be physical coordinates at all), you might want to match more than two tables, you might want to identify groups of matching objects in a single table, you might want the output to include rows which don't match as well...
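
The Cartesian example above can be checked with a few lines of Java. The sketch below compares every row of the first catalogue with every row of the second, which is the naive approach and is not how TOPCAT performs the match; the class and method names are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class NaiveMatchSketch {

    /**
     * Returns {i, j} index pairs for which row i of a and row j of b
     * lie within maxDist of each other. Each row is an {x, y} pair.
     */
    static List<int[]> match(double[][] a, double[][] b, double maxDist) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < b.length; j++) {
                double dx = a[i][0] - b[j][0];
                double dy = a[i][1] - b[j][1];
                // hypot(dx, dy) is sqrt(dx^2 + dy^2)
                if (Math.hypot(dx, dy) <= maxDist) {
                    pairs.add(new int[] {i, j});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        double[][] cat1 = {
            {1134.822, 599.247}, {659.68, 1046.874}, {909.613, 543.293}};
        double[][] cat2 = {
            {909.523, 543.800}, {1832.114, 409.567},
            {1135.201, 600.100}, {702.622, 1004.972}};
        for (int[] p : match(cat1, cat2, 1.0)) {
            System.out.println("first[" + p[0] + "] matches second[" + p[1] + "]");
        }
    }
}
```

Run on the two catalogues above, this pairs the first row of the first table with the third row of the second, and the third row of the first table with the first row of the second, giving the two output rows shown.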

The Match Window allows you to specify the details of the match and to start the matching operation. Depending on the type of match chosen, some additional columns may be appended to the resulting table giving additional details on how the match went. Usually, the 'match score' is one of these; the exact value and meaning of this column depends on the match, but it typically gives the distance between the matched points in some sensible units; the smaller the value, the better the match. You can find out exactly what this score means by examining the column's description in the Columns Window. Columns in the resulting table retain their original names unless that would lead to ambiguity, in which case a disambiguating suffix "_1" or "_2" is added to the column name.

To match two tables, use the Pair Match button in the Control Window; to match more than two tables at once, use the other options on the Control Window's Join menu.

4.3 Matching Rows Within a Table

Although the effect is rather different, searching through a single table for rows which match each other (refer to the same object, as explained above) is a similar process and requires much of the same information to be specified, mainly, what counts as a match. You can do this using the Internal Match Window, obtained by using the Internal Match button in the Control Window.

4.4 Notes on Matching

This section provides a bit more detail on how the row matching is done. It is designed to give a rough idea to interested parties; it is not a tutorial description from first principles of how it all works.

The basic algorithm for matching is based on dividing up the space of possibly-matching rows into an (indeterminate) number of bins. These bins will typically correspond to disjoint cells of a physical or notional coordinate space, but need not do so. In the first step, each row of each table is assessed to determine which bins might contain matches to it - this will generally be the bin that it falls into and any "adjacent" bins within a distance corresponding to the matching tolerance. A reference to the row is associated with each such bin. In the second step, each bin is examined, and if two or more rows are associated with it, every possible pair of rows in the associated set is assessed to see whether it does in fact constitute a matched pair. This will identify all and only those row pairs which are related according to the selected match criteria. During this process a number of optimisations may be applied depending on the details of the data and the requested match.

This means that the matching algorithm is basically an O(N log(N)) process, where N is the total number of rows in all the tables participating in a match. This is good news, since the naive interpretation would be O(N^2). This can break down however if the matching tolerance is such that the number of rows associated with some or most bins gets large, in which case an O(M^2) component can come to dominate, where M is the number of rows per bin. The average number of rows per bin is reported in the logging while a match is proceeding, so you can keep an eye on this.
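
To make the binning idea concrete, here is a much simplified Java sketch for a two-dimensional Cartesian match between two tables. It is not the code used by TOPCAT and makes several assumptions of its own: the bins are square grid cells whose side equals the tolerance, rows of the first table are filed under their own cell only, and each row of the second table is compared with the rows filed under its own cell and the eight adjacent ones. All names are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BinMatchSketch {

    /** Key identifying the grid cell with the given integer coordinates. */
    static String cellKey(long ix, long iy) {
        return ix + ":" + iy;
    }

    /** Index of the grid cell (of side tol) containing coordinate v. */
    static long cell(double v, double tol) {
        return (long) Math.floor(v / tol);
    }

    /** Returns {i, j} pairs of rows of a and b lying within tol. */
    static List<int[]> match(double[][] a, double[][] b, double tol) {
        // File each row of the first table under the cell it falls into.
        Map<String, List<Integer>> bins = new HashMap<>();
        for (int i = 0; i < a.length; i++) {
            String key = cellKey(cell(a[i][0], tol), cell(a[i][1], tol));
            bins.computeIfAbsent(key, k -> new ArrayList<>()).add(i);
        }
        // For each row of the second table, only rows filed under its own
        // cell or an adjacent one can lie within tol, so only those are
        // compared explicitly.
        List<int[]> pairs = new ArrayList<>();
        for (int j = 0; j < b.length; j++) {
            long jx = cell(b[j][0], tol);
            long jy = cell(b[j][1], tol);
            for (long ix = jx - 1; ix <= jx + 1; ix++) {
                for (long iy = jy - 1; iy <= jy + 1; iy++) {
                    List<Integer> candidates = bins.get(cellKey(ix, iy));
                    if (candidates == null) {
                        continue;
                    }
                    for (int i : candidates) {
                        double dx = a[i][0] - b[j][0];
                        double dy = a[i][1] - b[j][1];
                        if (Math.hypot(dx, dy) <= tol) {
                            pairs.add(new int[] {i, j});
                        }
                    }
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        double[][] cat1 = {
            {1134.822, 599.247}, {659.68, 1046.874}, {909.613, 543.293}};
        double[][] cat2 = {
            {909.523, 543.800}, {1832.114, 409.567},
            {1135.201, 600.100}, {702.622, 1004.972}};
        for (int[] p : match(cat1, cat2, 1.0)) {
            System.out.println("first[" + p[0] + "] matches second[" + p[1] + "]");
        }
    }
}
```

The explicit distance test is applied only to rows sharing a neighbourhood of cells, so the cost of the pairwise comparisons depends on how many rows fall in each bin rather than on the product of the table sizes.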

For more detail on the matching algorithms, see the javadocs for the uk.ac.starlink.table.join package, or contact the author.


5 Activation Actions

As well as seeing the overview of table data provided by a plot or statistics summary, it is often necessary to focus on a particular row of the table, which according to the nature of the table may represent an astronomical object, an event or some other entity. In the Data Window a table row is simply a row of the displayed JTable, and in a plot it corresponds to one plotted point.

If you click on a point in a plot, or on a row in the Data Window, the corresponding table row will be activated. When a row is activated, three things happen:

  1. If the Plot Window is visible, a marker will be drawn centred on the point
  2. If the Data Window is visible, the table will be scrolled to show the row and it will be highlighted
  3. If an activation action has been defined, it will be invoked
The first two of these mean that you can easily see which point in a plot corresponds to which row in the table and vice versa - just click on one and the other will be highlighted.

The third one can be more complicated. By default, no activation action is set, so nothing else happens, and this may very well be what you want. However, by clicking on the Activation Action selector in the Control Window you can bring up the Activation Window which enables you to choose an additional action to take place. There are various options here and various ways to achieve them (see Appendix A.9 for more details) but the kinds of actions which are envisaged are to display one or more images or spectra relating to the row you have identified. One of the options available for instance retrieves a postage-stamp image of a few arcminutes around the sky position defined by the row from a SuperCOSMOS all-sky image survey and pops it up in a viewer window. So for instance having spotted an interesting point in a plot of a galaxy catalogue you can click on it, and immediately see a picture to identify its morphological type.

The exact actions you want to perform may be closely tailored to the data you have, for instance you may have a set of spectra on disk named by object ID. It's impossible to cater for such possibilities with a set of pre-packaged options, so you are able to define your own custom actions here. This is done by writing an expression using the syntax described in Section 6. A number of special functions (described in Section 6.5.2) are provided to do things like display an image or a spectrum in a browser (given its filename or URL), or access data from certain data servers on the web, but there is nothing to stop the adventurous plugging in their own external programs so in principle you can configure pretty much anything to happen on the basis of the values in the row that you have activated.


6 Algebraic Expression Syntax

TOPCAT allows you to enter algebraic expressions in three contexts:

  1. To define a new column in terms of existing columns in the Synthetic Column dialogue
  2. To define a new Row Subset on the basis of table data in the Algebraic Subset dialogue
  3. To define a custom Activation Action in the Activation dialogue.
This is a powerful feature which permits you to manipulate and select table data in very flexible ways - you can think of it like a sort of column-oriented spreadsheet. The syntax for entering these expressions is explained in this section.

What you write are actually expressions in the Java language, which are compiled into Java bytecode before evaluation. However, this does not mean that you need to be a Java programmer to write them. The syntax is pretty similar to C, but even if you've never programmed in C most simple things, and some complicated ones, are quite intuitive.

The following explanation gives some guidance and examples for writing these expressions. Unfortunately a complete tutorial on writing Java is beyond the scope of this document, but it should provide enough information for even a novice to write useful expressions.

The expressions that you can write are basically any function of all the column values and subset inclusion flags which apply to a given row; the function result can then define the per-row value of a new column, or the inclusion flag for a new subset, or the action to be performed when a row is activated by clicking on it. If the built-in operators and functions are not sufficient, or it's unwieldy to express your function in one line of code, you can add new functions by writing your own classes - see Section 6.8.

Note: if Java is running in an environment with certain security restrictions (a security manager which does not permit creation of custom class loaders) then algebraic expressions won't work at all, and the buttons which allow you to enter them will be disabled.

6.1 Referencing Cell Values

To create a useful expression for a cell in a column, you will have to refer to other cells in different columns of the same table row. You can do this in two ways:

By Name
The Name of the column may be used if it is unique (no other column in the table has the same name) and if it has a suitable form. This means that it must have the form of a Java variable - basically starting with a letter and continuing with letters or numbers. In particular it cannot have any spaces in it. The underscore and currency symbols count as letters for this purpose. Column names are treated case-insensitively.
By $ID
The "$ID" identifier of the column may always be used to refer to it; this is a useful fallback if the column name isn't suitable for some reason (for instance it contains spaces or is not unique). This is just a "$" sign followed by a unique integer assigned by the program to each column when it is first encountered. You can find out the $ID identifier by looking in the Columns Window.

There is a special column whose name is "Index" and whose ID is "$0". The value of this is the same as the row number in the unsorted table (the grey numbers on the left of the grid in the Data Window).

The value of the variables so referenced will be a primitive (boolean, byte, short, char, int, long, float, double) if the column contains one of the corresponding types. Otherwise it will be an Object of the type held by the column, for instance a String. In practice this means: you can write the name of a column, and it will evaluate to the numeric (or string) value that that column contains in each row. You can then use this in normal algebraic expressions such as "B_MAG - U_MAG" as you'd expect.
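
The "suitable form" requirement for referencing a column by name can be tested outside TOPCAT with the standard Java character classification methods. The following sketch assumes that the rule is exactly the one for Java identifiers; the class and method names are hypothetical, and uniqueness of the name within the table is not checked.

```java
public class ColumnNameSketch {

    /** Tests whether a column name has the form of a Java variable name. */
    static boolean hasSuitableForm(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        // First character: a letter, underscore or currency symbol.
        if (!Character.isJavaIdentifierStart(name.charAt(0))) {
            return false;
        }
        // Remaining characters: as above, or digits.
        for (int i = 1; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] names = {"B_MAG", "Vmag", "V mag", "2MASS_ID", "H-K"};
        for (String name : names) {
            System.out.println(name + ": " + hasSuitableForm(name));
        }
    }
}
```

A column whose name fails this test (for instance "V mag" or "H-K") can still be referenced by its $ID.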

6.2 Referencing Row Subset Flags

If you have any Row Subsets defined you can also access the value of the boolean (true/false) flag indicating whether the current row is in each subset. Again there are two ways of doing this:

By Name
The name assigned to the subset when it was created can be used if it is unique and if it has a suitable form. The same comments apply as to column names above.
By _ID
The "_ID" identifier of the subset may always be used to refer to it. Like the "$ID" identifier for columns above, this is a unique index preceded by a special symbol, this time the underscore, "_".

Note: in previous versions of TOPCAT the hash sign ("#") was used instead of the underscore for this purpose; the hash sign no longer has this meaning.

In either case, the value will be a boolean value; these can be useful in conjunction with the conditional "? :" operator or when combining existing subsets using logical operators to create a new subset.

6.3 Null Values

When no special steps are taken, if a null value (blank cell) is encountered in evaluating an expression (usually because one of the columns it relies on has a null value in the row in question) then the result of the expression is also null.

It is possible to exercise more control than this, but it requires a little bit of care, because the expressions work in terms of primitive values (numeric or boolean ones) which don't in general have a defined null value. The name "null" in expressions gives you the Java null reference, but this cannot be matched against a primitive value or used as the return value of a primitive expression.

For most purposes, the following two tips should enable you to work with null values:

Testing for null
To test whether a column contains a null value, prepend the string "NULL_" (use upper case) to the column name or $ID. This will yield a boolean value which is true if the column contains a blank or a floating point NaN (not-a-number) value, and false otherwise.
Returning null
To return a null value from a numeric expression, use the name "NULL" (upper case). To return a null value from a non-numeric expression (e.g. a String column) use the name "null" (lower case).

Null values are often used in conjunction with the conditional operator, "? :"; the expression

   test ? tval : fval
returns the value tval if the boolean expression test evaluates true, or fval if test evaluates false. So for instance the following expression:
   Vmag == -99 ? NULL : Vmag
can be used to define a new column which has the same value as the Vmag column for most values, but if Vmag has the "magic" value -99 the new column will contain a blank. The opposite trick (substituting a blank value with a magic one) can be done like this:
   NULL_Vmag ? -99 : Vmag
Some more examples are given in Section 6.7.
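
The names NULL and NULL_Vmag are special to TOPCAT and are not ordinary Java, but the effect of the two expressions above can be mimicked in plain Java for illustration. The sketch below is an analogy only: it assumes a blank cell is represented by a null Double, and its class and method names are invented.

```java
public class NullSketch {

    /** Analogue of "Vmag == -99 ? NULL : Vmag": magic value becomes blank. */
    static Double magicToBlank(Double vmag) {
        if (vmag == null) {
            return null;              // a blank input gives a blank result
        }
        return vmag.doubleValue() == -99 ? null : vmag;
    }

    /** Analogue of "NULL_Vmag ? -99 : Vmag": blank or NaN becomes -99. */
    static double blankToMagic(Double vmag) {
        boolean isBlank = vmag == null || vmag.isNaN();
        return isBlank ? -99.0 : vmag.doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(magicToBlank(-99.0));
        System.out.println(magicToBlank(13.8));
        System.out.println(blankToMagic(null));
        System.out.println(blankToMagic(9.3));
    }
}
```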

6.4 Operators

The operators are pretty much the same as in the C language. The common ones are:

Arithmetic
+ (add)
- (subtract)
* (multiply)
/ (divide)
% (modulus)
Logical
! (not)
&& (and)
|| (or)
^ (exclusive-or)
== (numeric identity)
!= (numeric non-identity)
< (less than)
> (greater than)
<= (less than or equal)
>= (greater than or equal)
Numeric Typecasts
(byte) (numeric -> signed byte)
(short) (numeric -> 2-byte integer)
(int) (numeric -> 4-byte integer)
(long) (numeric -> 8-byte integer)
(float) (numeric -> 4-byte floating point)
(double) (numeric -> 8-byte floating point)
Note you may find the numeric conversion functions in the Maths class described in Section 6.5.1 below more convenient for numeric conversions than these.
Other
+ (string concatenation)
[] (array dereferencing)
?: (conditional switch)
instanceof (class membership)
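
Since these are ordinary Java operators, a few of their less obvious properties can be tried out in plain Java. The following sketch wraps some of them in small methods; the class and method names are invented for the illustration and have nothing to do with TOPCAT's own functions.

```java
public class OperatorSketch {

    /** (int) typecast: the fractional part is discarded, not rounded. */
    static int toInt(double x) {
        return (int) x;
    }

    /** % modulus: the result takes the sign of the left-hand operand. */
    static int mod(int a, int b) {
        return a % b;
    }

    /** / between two integers is integer division. */
    static int intDivide(int a, int b) {
        return a / b;
    }

    /** Casting one operand to double first gives a fractional result. */
    static double ratio(int a, int b) {
        return (double) a / b;
    }

    /** ^ is exclusive-or, not exponentiation. */
    static boolean xor(boolean p, boolean q) {
        return p ^ q;
    }

    /** ?: conditional switch. */
    static String classify(double mag) {
        return mag < 10 ? "bright" : "faint";
    }

    public static void main(String[] args) {
        System.out.println(toInt(3.7));        // 3
        System.out.println(toInt(-3.7));       // -3
        System.out.println(mod(-7, 3));        // -1
        System.out.println(intDivide(7, 2));   // 3
        System.out.println(ratio(7, 2));       // 3.5
        System.out.println(xor(true, true));   // false
        System.out.println(classify(9.3));     // bright
    }
}
```

In particular, an expression dividing one integer column by another will discard the remainder unless one of them is first cast to a floating point type.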

6.5 Functions

Many functions are available for use within your expressions, covering standard mathematical and trigonometric functions, arithmetic utility functions, type conversions, and some more specialised astronomical ones, as well as providing actions to take when a point is activated. You can use them in just the way you'd expect, by using the function name (unlike column names, this is case-sensitive) followed by comma-separated arguments in brackets, so

    max(IMAG,JMAG)
will give you the larger of the values in the columns IMAG and JMAG, and so on.

The functions available for use by default are listed by class in the following subsections, one for general functions (used in defining new synthetic columns or row subsets) and the other for activation functions (used only for defining Activation Actions). More detailed documentation of what these functions do, the meaning of their parameters, examples of use etc. is available from within TOPCAT in the Available Functions Window.

6.5.1 General Functions

The following functions can be used anywhere that you can write an algebraic expression in TOPCAT. They will typically be used for defining new synthetic columns or algebraically-defined row subsets.

Times
Functions for conversion of time values between various forms. The forms used are
Modified Julian Date (MJD)
A continuous measure in days since midnight at the start of 17 November 1858. Based on UTC.
ISO 8601
A string representation of the form yyyy-mm-ddThh:mm:ss.s, where the T is a literal character (a space character may be used instead). Based on UTC.
Julian Epoch
A continuous measure based on a Julian year of exactly 365.25 days. For approximate purposes this resembles the fractional number of years AD represented by the date. Sometimes (but not here) represented by prefixing a 'J'; J2000.0 is defined as 2000 January 1.5 in the TT timescale.
Besselian Epoch
A continuous measure based on a tropical year of about 365.2422 days. For approximate purposes this resembles the fractional number of years AD represented by the date. Sometimes (but not here) represented by prefixing a 'B'.

Therefore midday on the 25th of October 2004 is 2004-10-25T12:00:00 in ISO 8601 format, 53303.5 as an MJD value, 2004.81588 as a Julian Epoch and 2004.81726 as a Besselian Epoch.

Currently this implementation cannot be relied upon to better than a millisecond.
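
The figures quoted above for midday on 25 October 2004 can be reproduced with the following Java sketch. It is not TOPCAT's implementation: it uses the conventional definitions of the Julian and Besselian epochs, with J2000.0 at MJD 51544.5, B1900.0 at MJD 15019.81352 and a tropical year of 365.242198781 days, and these constants are assumptions of the sketch rather than values taken from this document. Leap seconds are ignored, and the class and method names are invented.

```java
import java.time.Duration;
import java.time.LocalDateTime;

public class TimeSketch {

    /** The zero point of MJD: midnight at the start of 17 November 1858. */
    static final LocalDateTime MJD_ZERO = LocalDateTime.of(1858, 11, 17, 0, 0);

    /** Converts a calendar date and time to Modified Julian Date. */
    static double toMjd(LocalDateTime dateTime) {
        double millisPerDay = 86400000.0;
        return Duration.between(MJD_ZERO, dateTime).toMillis() / millisPerDay;
    }

    /** Converts MJD to Julian Epoch (Julian year of 365.25 days). */
    static double mjdToJulian(double mjd) {
        return 2000.0 + (mjd - 51544.5) / 365.25;
    }

    /** Converts MJD to Besselian Epoch (tropical year). */
    static double mjdToBesselian(double mjd) {
        return 1900.0 + (mjd - 15019.81352) / 365.242198781;
    }

    public static void main(String[] args) {
        double mjd = toMjd(LocalDateTime.of(2004, 10, 25, 12, 0, 0));
        System.out.println("MJD:             " + mjd);
        System.out.printf("Julian Epoch:    %.5f%n", mjdToJulian(mjd));
        System.out.printf("Besselian Epoch: %.5f%n", mjdToBesselian(mjd));
    }
}
```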

isoToMjd( isoDate )
Converts an ISO8601 date string to Modified Julian Date. The basic format of the isoDate argument is yyyy-mm-ddThh:mm:ss.s, though some deviations from this form are permitted:
  • The 'T' which separates date from time can be replaced by a space
  • The seconds, minutes and/or hours can be omitted
  • The decimal part of the seconds can be any length, and is optional
  • A 'Z' (which indicates UTC) may be appended to the time
Some legal examples are therefore: "1994-12-21T14:18:23.2", "1968-01-14", and "2112-05-25 16:45Z".
dateToMjd( year, month, day, hour, min, sec )
Converts a calendar date and time to Modified Julian Date.
dateToMjd( year, month, day )
Converts a calendar date to Modified Julian Date.
mjdToIso( mjd )
Converts a Modified Julian Date value to an ISO 8601-format date-time string. The output format is yyyy-mm-ddThh:mm:ss.
mjdToDate( mjd )
Converts a Modified Julian Date value to an ISO 8601-format date string. The output format is yyyy-mm-dd.
mjdToTime( mjd )
Converts a Modified Julian Date value to an ISO 8601-format time-only string. The output format is hh:mm:ss.
formatMjd( mjd, format )
Converts a Modified Julian Date value to a date using a customisable date format. The format is as defined by the java.text.SimpleDateFormat class. The default output corresponds to the string "yyyy-MM-dd'T'HH:mm:ss"
mjdToJulian( mjd )
Converts a Modified Julian Date to Julian Epoch. For approximate purposes, the result of this routine consists of an integral part which gives the year AD and a fractional part which represents the distance through that year, so that for instance 2000.5 is approximately 1 July 2000.
julianToMjd( julianEpoch )
Converts a Julian Epoch to Modified Julian Date. For approximate purposes, the argument of this routine consists of an integral part which gives the year AD and a fractional part which represents the distance through that year, so that for instance 2000.5 is approximately 1 July 2000.
mjdToBesselian( mjd )
Converts Modified Julian Date to Besselian Epoch. For approximate purposes, the result of this routine consists of an integral part which gives the year AD and a fractional part which represents the distance through that year, so that for instance 1950.5 is approximately 1 July 1950.
besselianToMjd( besselianEpoch )
Converts Besselian Epoch to Modified Julian Date. For approximate purposes, the argument of this routine consists of an integral part which gives the year AD and a fractional part which represents the distance through that year, so that for instance 1950.5 is approximately 1 July 1950.

Strings
String manipulation and query functions.

concat( s1, s2 )
Concatenates two strings. In most cases the same effect can be achieved by writing s1+s2, but blank values can sometimes appear as the string "null" if you do it like that.
concat( s1, s2, s3 )
Concatenates three strings. In most cases the same effect can be achieved by writing s1+s2+s3, but blank values can sometimes appear as the string "null" if you do it like that.
concat( s1, s2, s3, s4 )
Concatenates four strings. In most cases the same effect can be achieved by writing s1+s2+s3+s4, but blank values can sometimes appear as the string "null" if you do it like that.
equals( s1, s2 )
Determines whether two strings are equal. Note you should use this function instead of s1==s2, which can (for technical reasons) return false even if the strings are the same.
equalsIgnoreCase( s1, s2 )
Determines whether two strings are equal apart from possible upper/lower case distinctions.
startsWith( whole, start )
Determines whether a string starts with a certain substring.
endsWith( whole, end )
Determines whether a string ends with a certain substring.
contains( whole, sub )
Determines whether a string contains a given substring.
length( str )
Returns the length of a string in characters.
matches( str, regex )
Tests whether a string matches a given regular expression.
matchGroup( str, regex )
Returns the first grouped expression matched in a string defined by a regular expression. A grouped expression is one enclosed in parentheses.
replaceFirst( str, regex, replacement )
Replaces the first occurrence of a regular expression in a string with a different substring value.
replaceAll( str, regex, replacement )
Replaces all occurrences of a regular expression in a string with a different substring value.
substring( str, startIndex )
Returns the last part of a given string. The substring begins with the character at the specified index and extends to the end of this string.
substring( str, startIndex, endIndex )
Returns a substring of a given string. The substring begins with the character at startIndex and continues to the character at index endIndex-1. Thus the length of the substring is endIndex-startIndex.
toUpperCase( str )
Returns an uppercased version of a string.
toLowerCase( str )
Returns a lowercased version of a string.
trim( str )
Trims whitespace from both ends of a string.
padWithZeros( value, ndigit )
Takes an integer argument and returns a string representing the same numeric value but padded with leading zeros to a specified length.
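
The remarks above about s1+s2 and s1==s2 reflect the behaviour of ordinary Java strings, which the following sketch demonstrates. It uses only standard Java, not TOPCAT's functions; the class and method names are invented, and zeroPad is merely one plausible way to get the kind of result described for padWithZeros.

```java
public class StringSketch {

    /** The + operator renders a null (blank) operand as the text "null". */
    static String plusConcat(String s1, String s2) {
        return s1 + s2;
    }

    /** == asks whether two references point to the very same object. */
    static boolean sameObject(String s1, String s2) {
        return s1 == s2;
    }

    /** equals() asks whether two strings contain the same characters. */
    static boolean sameText(String s1, String s2) {
        return s1 != null && s1.equals(s2);
    }

    /** Pads a non-negative integer with leading zeros to ndigit characters. */
    static String zeroPad(long value, int ndigit) {
        return String.format("%0" + ndigit + "d", value);
    }

    public static void main(String[] args) {
        String a = new String("M31");   // a distinct object with the same text
        String b = "M31";
        System.out.println(plusConcat("M", null));   // Mnull
        System.out.println(sameObject(a, b));        // false
        System.out.println(sameText(a, b));          // true
        System.out.println(zeroPad(42, 5));          // 00042
        System.out.println("NGC1275".substring(3));  // 1275
        System.out.println("NGC1275".substring(0, 3)); // NGC
    }
}
```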

Maths
Standard mathematical and trigonometric functions.

E
Euler's number e, the base of natural logarithms.
PI
Pi, the ratio of the circumference of a circle to its diameter.
RANDOM
Evaluates to a random number in the range 0<=x<1. This is different for each cell of the table. The quality of the randomness may not be particularly good.
sin( theta )
Sine of an angle.
cos( theta )
Cosine of an angle.
tan( theta )
Tangent of an angle.
asin( x )
Arc sine. The result is an angle in the range -pi/2 through pi/2.
acos( x )
Arc cosine. The result is an angle in the range 0.0 through pi.
atan( x )
Arc tangent. The result is an angle in the range -pi/2 through pi/2.
exp( x )
Euler's number e raised to a power.
log10( x )
Logarithm to base 10.
ln( x )
Natural logarithm.
sqrt( x )
Square root. The result is correctly rounded and positive.
atan2( y, x )
Converts rectangular coordinates (x,y) to polar (r,theta). This method computes the phase theta by computing an arc tangent of y/x in the range of -pi to pi.
pow( a, b )
Exponentiation. The result is the value of the first argument raised to the power of the second argument.
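
The difference between atan2( y, x ) and atan( y/x ) is easy to overlook, so here is a short Java sketch using the standard java.lang.Math methods, on the assumption that the functions above behave in the same way. The class and method names are invented.

```java
public class Atan2Sketch {

    /** Polar angle theta of the point (x, y), in the range -pi to pi. */
    static double theta(double x, double y) {
        return Math.atan2(y, x);    // note the argument order: y first
    }

    /** Polar radius r of the point (x, y). */
    static double radius(double x, double y) {
        return Math.sqrt(Math.pow(x, 2) + Math.pow(y, 2));
    }

    public static void main(String[] args) {
        // The point (-1, 1) lies in the second quadrant, at 135 degrees.
        System.out.println(Math.toDegrees(theta(-1, 1)));
        // atan(y/x) cannot tell (-1, 1) from (1, -1), and gives -45 degrees.
        System.out.println(Math.toDegrees(Math.atan(1.0 / -1.0)));
        System.out.println(radius(3, 4));
    }
}
```

Because atan only sees the ratio y/x, its result is confined to the range -pi/2 through pi/2; atan2 uses the signs of both arguments to place the angle in the correct quadrant.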

Formats
Functions for formatting numeric values.

formatDecimal( value, dp )
Turns a floating point value into a string with a given number of decimal places.
formatDecimal( value, format )
Turns a floating point value into a formatted string. The format string is as defined by Java's java.text.DecimalFormat class.
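
For readers unfamiliar with java.text.DecimalFormat, the following sketch shows the kind of pattern string it accepts. It calls the standard Java class directly rather than TOPCAT's formatDecimal functions; the class and method names are invented, and fixing the locale so that the decimal separator is always "." is a choice made for this example.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class FormatSketch {

    /** Formats a value using a DecimalFormat pattern such as "0.000". */
    static String format(double value, String pattern) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(Locale.ROOT);
        return new DecimalFormat(pattern, symbols).format(value);
    }

    /** Formats a value with a fixed number of decimal places. */
    static String format(double value, int dp) {
        StringBuilder pattern = new StringBuilder("0");
        if (dp > 0) {
            pattern.append('.');
            for (int i = 0; i < dp; i++) {
                pattern.append('0');   // "0" means "always show this digit"
            }
        }
        return format(value, pattern.toString());
    }

    public static void main(String[] args) {
        System.out.println(format(3.14159, 2));       // 3.14
        System.out.println(format(99.0, "0.000"));    // 99.000
        System.out.println(format(17.0, 0));          // 17
    }
}
```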

Coords
Functions for angle transformations and manipulations. In particular, methods for translating between radians and HH:MM:SS.S or DDD:MM:SS.S type sexagesimal representations are provided.

DEGREE
The size of one degree in radians.
HOUR
The size of one hour of right ascension in radians.
ARC_MINUTE
The size of one arcminute in radians.
ARC_SECOND
The size of one arcsecond in radians.
radiansToDms( rad )
Converts an angle in radians to a formatted degrees:minutes:seconds string. No fractional part of the seconds field is given.
radiansToDms( rad, secFig )
Converts an angle in radians to a formatted degrees:minutes:seconds string with a given number of decimal places in the seconds field.
radiansToHms( rad )
Converts an angle in radians to a formatted hours:minutes:seconds string. No fractional part of the seconds field is given.
radiansToHms( rad, secFig )
Converts an angle in radians to a formatted hours:minutes:seconds string with a given number of decimal places in the seconds field.
dmsToRadians( dms )
Converts a formatted degrees:minutes:seconds string to an angle in radians. Delimiters may be colon, space, characters dm[s], or some others. Additional spaces and leading +/- are permitted.
hmsToRadians( hms )
Converts a formatted hours:minutes:seconds string to an angle in radians. Delimiters may be colon, space, characters hm[s], or some others. Additional spaces and leading +/- are permitted.
dmsToRadians( deg, min, sec )
Converts degrees, minutes, seconds to an angle in radians.

In conversions of this type, one has to be careful to get the sign right in converting angles which are between 0 and -1 degrees. This routine uses the sign bit of the deg argument, taking care to distinguish between +0 and -0 (their internal representations are different for floating point values). It is illegal for the min or sec arguments to be negative.

hmsToRadians( hour, min, sec )
Converts hours, minutes, seconds to an angle in radians.

In conversions of this type, one has to be careful to get the sign right in converting angles which are between 0 and -1 hours. This routine uses the sign bit of the hour argument, taking care to distinguish between +0 and -0 (their internal representations are different for floating point values).

skyDistance( ra1, dec1, ra2, dec2 )
Calculates the separation (distance around a great circle) of two points on the sky.
skyDistanceDegrees( ra1, dec1, ra2, dec2 )
Calculates the separation (distance around a great circle) of two points on the sky in degrees.
hoursToRadians( hours )
Converts hours to radians.
degreesToRadians( deg )
Converts degrees to radians.
radiansToDegrees( rad )
Converts radians to degrees.
raFK4toFK5( raFK4, decFK4 )
Converts a B1950.0 FK4 position to J2000.0 FK5 at an epoch of B1950.0 yielding Right Ascension. This assumes zero proper motion in the FK5 frame.
decFK4toFK5( raFK4, decFK4 )
Converts a B1950.0 FK4 position to J2000.0 FK5 at an epoch of B1950.0 yielding Declination. This assumes zero proper motion in the FK5 frame.
raFK5toFK4( raFK5, decFK5 )
Converts a J2000.0 FK5 position to B1950.0 FK4 at an epoch of B1950.0 yielding Right Ascension. This assumes zero proper motion, parallax and radial velocity in the FK5 frame.
decFK5toFK4( raFK5, decFK5 )
Converts a J2000.0 FK5 position to B1950.0 FK4 at an epoch of B1950.0 yielding Declination. This assumes zero proper motion, parallax and radial velocity in the FK5 frame.
raFK4toFK5( raFK4, decFK4, bepoch )
Converts a B1950.0 FK4 position to J2000.0 FK5 yielding Right Ascension. This assumes zero proper motion in the FK5 frame. The bepoch parameter is the epoch at which the position in the FK4 frame was determined.
decFK4toFK5( raFK4, decFK4, bepoch )
Converts a B1950.0 FK4 position to J2000.0 FK5 yielding Declination. This assumes zero proper motion in the FK5 frame. The bepoch parameter is the epoch at which the position in the FK4 frame was determined.
raFK5toFK4( raFK5, decFK5, bepoch )
Converts a J2000.0 FK5 position to B1950.0 FK4 yielding Right Ascension. This assumes zero proper motion, parallax and radial velocity in the FK5 frame.
decFK5toFK4( raFK5, decFK5, bepoch )
Converts a J2000.0 FK5 position to B1950.0 FK4 yielding Declination. This assumes zero proper motion, parallax and radial velocity in the FK5 frame.
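
For Java programmers, the sign handling described for dmsToRadians and the great-circle separation computed by skyDistance can be sketched roughly as follows. This is an illustrative re-implementation rather than TOPCAT's own code: the class name SkyCoordsSketch is hypothetical, the use of the haversine formula is an assumption, and skyDistance is assumed here to take and return radians.

```java
/** Illustrative sketch only; not TOPCAT's own implementation. */
class SkyCoordsSketch {

    /** Converts degrees, minutes, seconds to radians, honouring the sign of -0.0. */
    static double dmsToRadians(double deg, double min, double sec) {
        if (min < 0 || sec < 0) {
            throw new IllegalArgumentException("min and sec must not be negative");
        }
        // -0.0 == 0.0 is true in Java, so test the IEEE sign bit instead of "deg < 0".
        boolean negative = Double.doubleToLongBits(deg) < 0;
        double magnitude = Math.abs(deg) + min / 60.0 + sec / 3600.0;
        return Math.toRadians(negative ? -magnitude : magnitude);
    }

    /** Great-circle separation in radians of two positions in radians (haversine formula). */
    static double skyDistance(double ra1, double dec1, double ra2, double dec2) {
        double sinHalfDec = Math.sin(0.5 * (dec2 - dec1));
        double sinHalfRa = Math.sin(0.5 * (ra2 - ra1));
        double a = sinHalfDec * sinHalfDec
                 + Math.cos(dec1) * Math.cos(dec2) * sinHalfRa * sinHalfRa;
        // Guard against rounding pushing the argument of asin slightly above 1.
        return 2.0 * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    public static void main(String[] args) {
        // -0d 30m 0s and +0d 30m 0s differ only in the sign bit of the deg argument.
        System.out.println(Math.toDegrees(dmsToRadians(-0.0, 30, 0)));
        System.out.println(Math.toDegrees(dmsToRadians(0.0, 30, 0)));
        // Two points on the equator a quarter of a circle apart.
        System.out.println(Math.toDegrees(skyDistance(0, 0, Math.PI / 2, 0)));
    }
}
```

Testing the sign bit rather than comparing with zero is what allows an angle such as -0:30:00 to come out negative.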

Conversions
Functions for converting between strings and numeric values.

toString( value )
Turns a numeric value into a string.
parseByte( str )
Attempts to interpret a string as a byte (8-bit signed integer) value. If the input string can't be interpreted in this way, a blank value will result.
parseShort( str )
Attempts to interpret a string as a short (16-bit signed integer) value. If the input string can't be interpreted in this way, a blank value will result.
parseInt( str )
Attempts to interpret a string as an int (32-bit signed integer) value. If the input string can't be interpreted in this way, a blank value will result.
parseLong( str )
Attempts to interpret a string as a long (64-bit signed integer) value. If the input string can't be interpreted in this way, a blank value will result.
parseFloat( str )
Attempts to interpret a string as a float (32-bit floating point) value. If the input string can't be interpreted in this way, a blank value will result.
parseDouble( str )
Attempts to interpret a string as a double (64-bit floating point) value. If the input string can't be interpreted in this way, a blank value will result.
toByte( value )
Attempts to convert the numeric argument to a byte (8-bit signed integer) result. If it is out of range, a blank value will result.
toShort( value )
Attempts to convert the numeric argument to a short (16-bit signed integer) result. If it is out of range, a blank value will result.
toInteger( value )
Attempts to convert the numeric argument to an int (32-bit signed integer) result. If it is out of range, a blank value will result.
toLong( value )
Attempts to convert the numeric argument to a long (64-bit signed integer) result. If it is out of range, a blank value will result.
toFloat( value )
Attempts to convert the numeric argument to a float (32-bit floating point) result. If it is out of range, a blank value will result.
toDouble( value )
Converts the numeric argument to a double (64-bit floating point) result.
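
The "blank on failure" behaviour of the parsing and narrowing functions above might be pictured in plain Java along the following lines. This is only a sketch: the class name ConversionSketch is hypothetical, and representing a blank value as a Java null is an assumption made for illustration, not a statement of how TOPCAT stores blanks.

```java
/** Illustrative sketch only; a blank result is modelled here as null. */
class ConversionSketch {

    /** Returns the parsed int, or null (blank) if the string is not an integer. */
    static Integer parseInt(String str) {
        try {
            return Integer.valueOf(str);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    /** Returns the value as a short, or null (blank) if it is out of range or NaN. */
    static Short toShort(double value) {
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            return Short.valueOf((short) value);
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(parseInt("42"));
        System.out.println(parseInt("4x"));
        System.out.println(toShort(12.0));
        System.out.println(toShort(70000));
    }
}
```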

Arithmetic
Standard arithmetic functions including things like rounding, sign manipulation, and maximum/minimum functions.

roundUp( x )
Rounds a value up to an integer value. Formally, returns the smallest (closest to negative infinity) integer value that is not less than the argument.
roundDown( x )
Rounds a value down to an integer value. Formally, returns the largest (closest to positive infinity) integer value that is not greater than the argument.
round( x )
Rounds a value to the nearest integer. Formally, returns the integer that is closest in value to the argument. If two integers are equally close, the result is the even one.
roundDecimal( x, dp )
Rounds a value to a given number of decimal places. The result is a float (32-bit floating point value), so this is only suitable for relatively low-precision values. It's intended for truncating the number of apparent significant figures represented by a value which you know has been obtained by combining other values of limited precision. For more control, see the functions in the Formats class.
abs( x )
Returns the absolute value of an integer value. If the argument is not negative, the argument is returned. If the argument is negative, the negation of the argument is returned.
abs( x )
Returns the absolute value of a floating point value. If the argument is not negative, the argument is returned. If the argument is negative, the negation of the argument is returned.
max( a, b )
Returns the greater of two integer values. If the arguments have the same value, the result is that same value.
max( a, b )
Returns the greater of two floating point values. If the arguments have the same value, the result is that same value. If either value is blank, then the result is blank.
min( a, b )
Returns the smaller of two integer values. If the arguments have the same value, the result is that same value.
min( a, b )
Returns the smaller of two floating point values. If the arguments have the same value, the result is that same value. If either value is blank, then the result is blank.
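
The rounding rules described above match the behaviour of the standard Java methods Math.ceil, Math.floor and Math.rint (the last of which resolves ties to the even integer). The sketch below shows that correspondence; whether TOPCAT actually delegates to these methods is an assumption, the class name RoundingSketch is hypothetical, the return types may differ from TOPCAT's, and the roundDecimal body is just one plausible way of obtaining a float result.

```java
/** Illustrative sketch only; not TOPCAT's own implementation. */
class RoundingSketch {

    static double roundUp(double x) {
        return Math.ceil(x);    // smallest integer value not less than x
    }

    static double roundDown(double x) {
        return Math.floor(x);   // largest integer value not greater than x
    }

    static double round(double x) {
        return Math.rint(x);    // nearest integer value; ties go to the even one
    }

    /** Rounds to dp decimal places, returning a 32-bit float. */
    static float roundDecimal(double x, int dp) {
        double scale = Math.pow(10, dp);
        return (float) (Math.rint(x * scale) / scale);
    }

    public static void main(String[] args) {
        System.out.println(roundUp(-1.5) + " " + roundDown(-1.5));
        System.out.println(round(2.5) + " " + round(3.5));
        System.out.println(roundDecimal(3.14159, 2));
    }
}
```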

More detail on these functions is available from within TOPCAT in the Available Functions window.

6.5.2 Activation Functions

The following functions can be used only for defining Activation Actions - they mostly deal with causing something to happen, such as popping up an image display window. They generally return a short string, which will be logged to the user to give a short indication of what happened (or didn't happen, or should have happened).

TwoQZ
Specialist functions for use with data from the 2QZ survey. Spectral data are taken directly from the 2QZ web site at http://www.2dfquasar.org/.

TWOQZ_SPEC_BASE
String prepended to the object NAME for the FITS spectra file URL.
TWOQZ_SPEC_TAIL
String appended to the object NAME for the FITS spectra file URL.
TWOQZ_FITS_IMAGE_BASE
String prepended to the object NAME for the FITS postage stamp URL.
TWOQZ_FITS_IMAGE_TAIL
String appended to the object NAME for the FITS postage stamp URL.
TWOQZ_JPEG_IMAGE_BASE
String prepended to the object NAME for the JPEG postage stamp URL.
TWOQZ_JPEG_IMAGE_TAIL
String appended to the object NAME for the JPEG postage stamp URL.
spectra2QZ( name, nobs )
Displays all the spectra relating to a 2QZ object in an external viewer (SPLAT).
image2QZ( name )
Displays the postage stamp FITS image for a 2QZ object in an image viewer.
jpeg2QZ( name )
Displays the postage stamp JPEG image for a 2QZ object in an external viewer.
get2qzSubdir( name )
Returns the name of the subdirectory (such as "ra03_04") for a given 2QZ object name (ID).

System
Executes commands on the local operating system. These are executed as if typed in from the shell, or command line.

exec( cmd, arg1 )
Executes an operating system command with one argument.
exec( cmd, arg1, arg2 )
Executes an operating system command with two arguments.
exec( cmd, arg1, arg2, arg3 )
Executes an operating system command with three arguments.
exec( line )
Executes a string as an operating system command. Any spaces in the string are taken to delimit words (the first word is the name of the command).
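
As an illustration of the single-string form, the following sketch splits a command line into words at runs of whitespace and hands the words to the standard ProcessBuilder class. The class and method names are hypothetical and this is not TOPCAT's own code; note that with this simple splitting rule an argument containing spaces cannot be passed as a single word, however it is quoted.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; not TOPCAT's own implementation. */
class ExecSketch {

    /** Splits a command line into words; the first word is the command name. */
    static List<String> splitCommandLine(String line) {
        return Arrays.asList(line.trim().split("\\s+"));
    }

    /** Runs the command, waits for it to finish and returns its exit status. */
    static int run(String line) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(splitCommandLine(line))
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        System.out.println(splitCommandLine("ls -l  /tmp"));
    }
}
```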

SuperCosmos
Specialist display functions for use with the SuperCOSMOS survey. These functions display cutout images from the various archives hosted at the SuperCOSMOS Sky Surveys (http://www-wfau.roe.ac.uk/sss/). In most cases these cover the whole of the southern sky.

SSS_BASE_URL
Base URL for SuperCOSMOS image cutout service.
sssCutout( ra, dec, pixels )
Displays a cutout image in one of the available bands from the SuperCOSMOS Sky Surveys. The displayed image is square, and pixels pixels in the X and Y dimensions. Pixels are approximately 0.67 arcsec square. Sky coverage is complete.
sssCutout( ra, dec )
Displays a cutout image of default size in one of the available bands from the SuperCOSMOS Sky Surveys. Sky coverage is complete.
sssCutoutBlue( ra, dec, pixels )
Displays a cutout image of a given size from one of the blue-band surveys from SuperCOSMOS. Sky coverage is complete.
sssCutoutRed( ra, dec, pixels )
Displays a cutout image of a given size from one of the red-band surveys from SuperCOSMOS. Sky coverage is complete.
displayUkstB( ra, dec, pixels )
Displays a cutout image taken from the SuperCOSMOS Sky Surveys UK Schmidt Telescope Bj-band survey. The displayed image is square, and pixels pixels in the X and Y dimensions. Pixels are approximately 0.67 arcsec square.

Sky coverage is -90<Dec<+2.5 (degrees).

displayUkstR( ra, dec, pixels )
Displays a cutout image taken from the SuperCOSMOS Sky Surveys UK Schmidt Telescope R-band survey. The displayed image is square, and pixels pixels in the X and Y dimensions. Pixels are approximately 0.67 arcsec square.

Sky coverage is -90<Dec<+2.5 (degrees).

displayUkstI( ra, dec, pixels )
Displays a cutout image taken from the SuperCOSMOS Sky Surveys UK Schmidt Telescope I-band survey. The displayed image is square, and pixels pixels in the X and Y dimensions. Pixels are approximately 0.67 arcsec square.

Sky coverage is -90<Dec<+2.5 (degrees).

displayEsoR( ra, dec, pixels )
Displays a cutout image taken from the SuperCOSMOS Sky Surveys ESO R-band survey. The displayed image is square, and pixels pixels in the X and Y dimensions. Pixels are approximately 0.67 arcsec square.

Sky coverage is -90<Dec<+2.5 (degrees).

displayPossE( ra, dec, pixels )
Displays a cutout image taken from the SuperCOSMOS Sky Surveys Palomar E-band survey. The displayed image is square, and pixels pixels in the X and Y dimensions. Pixels are approximately 0.67 arcsec square.

Sky coverage is -20.5<Dec<+2.5 (degrees).

Splat
Functions for display of spectra in the external viewer SPLAT.

splat( label, loc )
Displays the resource at a given location as a spectrum in a spectrum viewer program (SPLAT). label may be any string which identifies the window for display, so that multiple (sets of) spectra may be displayed in different windows without getting in each other's way. loc should be a filename pointing to a spectrum in a format that SPLAT understands (includes FITS, NDF). In some cases, a URL can be used too.
splat2( label, loc1, loc2 )
Displays two spectra in the same (SPLAT) viewer. This may be useful to compare two spectra which correspond to the same table row.
splatMulti( label, locs )
Generic routine for displaying multiple spectra simultaneously in the same SPLAT plot.

Spectrum
Functions for general display of spectra in a window. Display is currently done using the SPLAT program, if available (http://www.starlink.ac.uk/splat/). Recognised spectrum formats include 1-dimensional FITS arrays and NDF files.

displaySpectrum( label, location )
Displays the file at the given location in a spectrum viewer.
displaySpectra( label, locations )
Displays the files at the given locations in a spectrum viewer. Each file represents a single spectrum, but they will be displayed within the same viewer window.

Sog
Functions for display of images in external viewer SOG (http://www.starlink.ac.uk/sog/).

sog( label, loc )
Displays the file at a given location as an image in a graphical (SoG) viewer. label may be any string which identifies the window for display, so that multiple images may be displayed in different windows without getting in each other's way. loc should be a filename or URL, pointing to an image in a format that SOG understands (this includes FITS, compressed FITS, and NDFs).

Sdss
Specialist display functions for use with the Sloan Digital Sky Survey (SDSS) SkyServer.

SDSS_BASE_URL
Base URL for SkyServer JPEG retrieval service.
sdssCutout( label, ra, dec, pixels )
Displays a colour cutout image of a specified size from the SDSS around a given sky position. The displayed image is square, a given number of (0.4arcsec) pixels on each side.
sdssCutout( ra, dec, pixels, scale )
Displays a colour cutout image of a specified size from the SDSS around a given sky position with pixels of a given size. Pixels are square, and their size on the sky is specified by the scale argument. The displayed image has pixels pixels along each side.
sdssCutout( ra, dec )
Displays a colour cutout image of a default size from the SDSS around a given sky position. The displayed image is 128 pixels square - a pixel is 0.4arcsec.
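
When choosing a value for the pixels argument it can help to work out how much sky a cutout covers: the side of the square image is simply the number of pixels multiplied by the pixel scale. The short sketch below does this arithmetic for the default SDSS cutout (128 pixels at 0.4 arcsec) and for a 300-pixel SuperCOSMOS cutout at about 0.67 arcsec per pixel; the class and method names are hypothetical.

```java
/** Illustrative arithmetic only. */
class CutoutSizeSketch {

    /** Angular size in arcseconds of one side of a square cutout. */
    static double sideArcsec(int pixels, double arcsecPerPixel) {
        return pixels * arcsecPerPixel;
    }

    public static void main(String[] args) {
        System.out.println("SDSS default: " + sideArcsec(128, 0.4) + " arcsec");
        System.out.println("SuperCOSMOS, 300 pixels: " + sideArcsec(300, 0.67) + " arcsec");
    }
}
```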

Output
Functions for simple logging output.

print( str )
Outputs a string value to the user log.
print( num )
Outputs a numeric value to the user log.

Mgc
Specialist functions for use with data from the Millennium Galaxy Survey.

MGC_IMAGE_BASE
String prepended to MGC_ID for the FITS image URL.
MGC_IMAGE_TAIL
String appended to MGC_ID for the FITS image URL.
imageMgc( mgc_id )
Displays the postage stamp FITS image for an MGC object in an image viewer.

Image
Functions for display of images in a window. Supported image formats include GIF, JPEG, PNG and FITS, which may be compressed. The SoG program (http://www.starlink.ac.uk/sog/) will be used if it is available, otherwise a no-frills image viewer will be used instead.

displayImage( label, location )
Displays the file at the given location in an image viewer.

Browsers
Displays URLs in web browsers.

basicBrowser( url )
Displays a URL in a basic HTML viewer. This is only likely to work for HTML, text or RTF data. The browser can follow hyperlinks and has simple forward/back buttons, but lacks the sophistication of a proper WWW browser application.
systemBrowser( url )
Attempts to display a URL in the system's default web browser. Exactly what counts as the default web browser is system dependent, as is whether this function will work properly.
mozilla( url )
Displays a URL in a Mozilla web browser. Probably only works on Unix-like operating systems, and only if Mozilla is already running.
firefox( url )
Displays a URL in a Firefox web browser. Probably only works on Unix-like operating systems, and only if Firefox is already running.
netscape( url )
Displays a URL in a Netscape web browser. Probably only works on Unix-like operating systems, and only if Netscape is already running.
mozalike( cmd, url )
Displays a URL in a web browser from the Mozilla family; it must support flags of the type "-remote openURL(url)". Probably only works on Unix-like operating systems, and only if the browser is already running.
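
The "-remote openURL(url)" convention mentioned for mozalike amounts to building a three-word command. A minimal sketch of that construction is given below; the class and method names are hypothetical and the code only assembles the words, it does not launch anything.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; not TOPCAT's own implementation. */
class RemoteBrowserSketch {

    /** Builds the words of a command of the form: cmd -remote openURL(url). */
    static List<String> remoteOpenCommand(String cmd, String url) {
        return Arrays.asList(cmd, "-remote", "openURL(" + url + ")");
    }

    public static void main(String[] args) {
        System.out.println(remoteOpenCommand("mozilla", "http://www.starlink.ac.uk/topcat/"));
    }
}
```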

BasicImageDisplay
Functions for display of graphics-format images in a no-frills viewing window (an ImageWindow). Supported image formats include GIF, JPEG, PNG and FITS, which may be compressed.

displayBasicImage( label, loc )
Displays the file at a given location as an image in a graphical viewer. label may be any string which identifies the window for display, so that multiple images may be displayed in different windows without getting in each other's way. loc should be a filename or URL, pointing to an image in a format that this viewer understands.

More detail on these functions is available from within TOPCAT in the Available Functions window.

6.5.3 Technical Note

This note provides a bit more detail for Java programmers on what is going on here; only read on if you want to understand how the use of functions in TOPCAT algebraic expressions relates to normal Java code.

The expressions which you write are compiled to Java bytecode when you enter them (if there is a 'compilation error' it will be reported straight away). The functions listed in the previous subsections are all the public static methods of the classes which are made available by default. The classes listed are all in the packages uk.ac.starlink.ttools.func and uk.ac.starlink.topcat.func (uk.ac.starlink.topcat.func.Strings etc). However, the public static methods are all imported into an anonymous namespace for bytecode compilation, so that you write sqrt(x,y) and not Maths.sqrt(x,y). The same happens to other classes that are imported (which can be in any package or none) - their public static methods all go into the anonymous namespace. Thus, method name clashes are a possibility.
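
To see which names a given class would contribute to such a namespace, one can list its public static methods using standard Java reflection. The following sketch does that for any class; the class name StaticMethodLister is hypothetical and the code is independent of TOPCAT and JEL, so it says nothing about how they perform the lookup internally.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.SortedSet;
import java.util.TreeSet;

/** Illustrative sketch only; independent of TOPCAT and JEL. */
class StaticMethodLister {

    /** Returns the distinct names of the public static methods of a class. */
    static SortedSet<String> publicStaticMethodNames(Class<?> clazz) {
        SortedSet<String> names = new TreeSet<>();
        // getMethods() returns public methods only, including inherited ones.
        for (Method method : clazz.getMethods()) {
            if (Modifier.isStatic(method.getModifiers())) {
                names.add(method.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(publicStaticMethodNames(Math.class));
    }
}
```

Running the same method over two classes and comparing the resulting sets is a quick way of spotting potential name clashes.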

This cleverness is all made possible by the rather wonderful JEL.

6.6 Instance Methods

There is another category of functions which can be used apart from those listed in the previous section. These are called, in Java/object-oriented parlance, "instance methods" and represent functions that can be executed on an object.

It is possible to invoke any public instance method of an object on that object (though not on primitive values - numeric and boolean ones). The syntax is that you place a "." followed by the method invocation after the object you want to invoke the method on, hence NAME.substring(3) instead of substring(NAME,3). If you know what you're doing, feel free to go ahead and do this. However, most of the instance methods you're likely to want to use have equivalents in the normal functions listed in the previous section, so unless you're a Java programmer or feeling adventurous, you are probably best off ignoring this feature.
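
In plain Java terms the two forms look like this. The static substring method below is a hypothetical stand-in for the function of the same name in the previous section, written here only so that the two call styles can be compared side by side.

```java
/** Illustrative sketch only. */
class InstanceMethodSketch {

    /** Function-style wrapper around the String instance method. */
    static String substring(String str, int start) {
        return str.substring(start);
    }

    public static void main(String[] args) {
        String name = "NGC1234";
        System.out.println(name.substring(3));   // instance method form
        System.out.println(substring(name, 3));  // function form
        // A primitive such as an int or double has no methods, so the
        // instance method form is not available for numeric or boolean values.
    }
}
```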

6.7 Examples

Here are some examples for synthetic columns (i.e. expressions which return values to appear in the table):

Average
    (first + second) * 0.5
Square root
    sqrt(variance)
Angle conversion
    radiansToDegrees(DEC_radians)
    degreesToRadians(RA_degrees)
Conversion from string to number
    parseInt($12)
    parseDouble(ident)
Conversion from number to string
    toString(index)
Conversion between numeric types
    toShort(obs_type)
    toDouble(range)
or
    (short) obs_type
    (double) range
Conversion from sexagesimal to radians
    hmsToRadians(RA1950)
    dmsToRadians(decDeg,decMin,decSec)
Conversion from radians to sexagesimal
    radiansToDms($3)
    radiansToHms(RA,2)
Outlier clipping
    min(1000, max(value, 0))
Converting a magic value to null
    jmag == 9999 ? NULL : jmag
Converting a null value to a magic one
    NULL_jmag ? 9999 : jmag
Taking the third scalar element from an array-valued column
    psfCounts[2]
and here are some examples of boolean expressions that could be used to define row subsets (or to create boolean synthetic columns):
Within a numeric range
    RA > 100 && RA < 120 && Dec > 75 && Dec < 85
Within a circle
    $2*$2 + $3*$3 < 1
    skyDistance(ra0,dec0,degreesToRadians(RA),degreesToRadians(DEC))<15*ARC_MINUTE
First 100 rows
    index <= 100
Every tenth row
    index % 10 == 0
String equality/matching
    equals(SECTOR, "ZZ9 Plural Z Alpha")
    equalsIgnoreCase(SECTOR, "zz9 plural z alpha")
    startsWith(SECTOR, "ZZ")
    contains(ph_qual, "U")
String regular expression matching
    matches(SECTOR, "[XYZ] Alpha")
Combining subsets
    (_1 && _2) && ! _3
Test for non-blank value
    ! NULL_ellipticity

6.8 Adding User-Defined Functions

The functions provided by default for use with algebraic expressions, while powerful, may not provide all the operations you need. For this reason, it is possible to write your own extensions to the expression language. In this way you can specify arbitrarily complicated functions. Note however that this will only allow you to define new columns or subsets where each cell is a function only of the other cells in the same row - it will not allow values in one row to be functions of values in another.

In order to do this, you have to write and compile a (probably short) program in the Java language. A full discussion of how to go about this is beyond the scope of this document, so if you are new to Java and/or programming you may need to find a friendly local programmer to assist (or mail the author). The following explanation is aimed at Java programmers, but may not be incomprehensible to non-specialists.

The steps you need to follow are:

  1. Write and compile a class containing one or more static public methods representing the function(s) required
  2. Make this class available on the application's classpath at runtime as described in Section 7.2.1
  3. Specify the class's name to the application, either as the value of the jel.classes or jel.classes.activation system properties (colon-separated if there are several) as described in Section 7.2.3, or during a run using the Available Functions Window's Add Class button

Any public static methods defined in the classes thus specified will be available for use in the Synthetic Column, Algebraic Subset or (in the case of activation functions only) Activation Window windows. They should be defined to take and return the relevant primitive or Object types for the function required (in the case of activation functions the return value should normally be a short log string). For instance a class written as follows would define a three-value average:

    public class AuxFuncs {
        public static double average3( double x, double y, double z ) {
            return ( x + y + z ) / 3.0;
        }
    }
and the expression "average3($1,$2,$3)" could then be used to define a new synthetic column, giving the average of the first three existing columns. Exactly how you would build this is dependent on your system, but it might involve doing something like the following:
  1. Writing a file named "AuxFuncs.java" containing the above code
  2. Compiling it using a command like "javac AuxFuncs.java"
  3. Starting up TOPCAT with the flags: "topcat -Djel.classes=AuxFuncs -classpath ."
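
An activation function can be written in the same way, returning a short log string rather than a numeric value. The class below is a hypothetical example (the names AuxActions and reportPosition are invented for illustration) which does nothing more than report the position it is given.

```java
/** Hypothetical example of a user-defined activation function class. */
public class AuxActions {

    /**
     * Reports a sky position on standard output.
     * The returned string is the short message intended for the user log.
     */
    public static String reportPosition(double raDeg, double decDeg) {
        String message = "position " + raDeg + ", " + decDeg;
        System.out.println(message);
        return message;
    }
}
```

By analogy with the steps above, such a class would be compiled and then named in the jel.classes.activation system property instead of jel.classes, after which an expression like "reportPosition(RA,DEC)" should be usable in the Activation Window (the column names RA and DEC are assumed here).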


7 Invoking TOPCAT

Starting up TOPCAT may just be a case of typing "topcat" or clicking on an appropriate icon and watching the Control Window pop up. If that is the case, and it's running happily for you, you can probably ignore this section. What follows is a description of how to start the program up, and various command line arguments and configuration options which can't be changed from within the program. Some examples are given in Section 7.4. Actually obtaining the program is not covered here; please see the TOPCAT web page http://www.starlink.ac.uk/topcat/.

There are various ways of starting up TOPCAT depending on how (and whether) it has been installed on your system; some of these are described below.

There may be some sort of short-cut icon on your desktop which starts up the program - in this case just clicking on it will probably work. Failing that you may be able to locate the jar file (probably named topcat.jar, topcat-full.jar or topcat-lite.jar) and click on that. These files would be located in the java/lib/topcat/ directory in a standard Starjava installation. Note that when you start by clicking on something you may not have the option of entering any of the command line options described below.

Alternatively you will have to invoke the program from the command line. If you have the full starjava installation on a Unix-like operating system, you can use the topcat script, which should be in the java/bin/ directory. So if that directory is on your path, you can write:

   topcat [java-args] [topcat-args]
In this case any arguments which start -D or -X are assumed to be arguments to the java command, a -classpath path defines a class path to be used in addition to the TOPCAT classes, and any remaining arguments are used by TOPCAT.

If you don't have the starjava Unix installation then to start from the command line you will have to use the java command itself. The most straightforward way of doing this will look like:

   java [java-args] -jar path/to/topcat.jar [topcat-args]
(or the same for topcat-full.jar etc). However NOTE: using java's -jar flag ignores any other class path information, such as the CLASSPATH environment variable or java's -classpath flag - see Section 7.2.1.

Note that Java Web Start can also be used to invoke the program without requiring any prior download/installation - sorry, this isn't documented properly here yet.

The meanings of the optional [topcat-args] and [java-args] sequences are described in Section 7.1 and Section 7.2 below respectively.

7.1 TOPCAT Command-line Arguments

You can start TOPCAT from the command line with no arguments - in this case it will just pop up the Control Window, from which you can load in tables. However, you may specify flags and/or table locations and formats.

If you invoke the program with the "-help" flag you will see the following usage message:

Usage: topcat <flags> [[-f <format>] <table> ...]

    General flags:
        -help      print this message and exit
        -version   print component versions etc and exit
        -verbose   increase verbosity of reports to console
        -demo      start with demo data
        -disk      use disk backing store for large tables
        -noserv    don't start SOAP services

    Optional load dialogue flags:
        -tree      hierarchy browser
        -file      basic file browser
        -sql       SQL query on relational database
        -cone      cone search dialogue
        -registry  VO registry query
        -siap      Simple Image Access Protocol queries

    Useful Java flags:
        -classpath jar1:jar2..  specify additional classes
        -XmxnnnM                use nnn megabytes of memory

    Auto-detected formats: 
        fits-plus, fits, votable

    All known formats:
        fits-plus, fits, votable, ascii, csv, wdc

    Useful system properties (-Dname=value - lists are colon-separated):
        java.io.tmpdir          temporary filespace directory
        jdbc.drivers            JDBC driver classes
        jel.classes             custom algebraic function classes
        jel.classes.activation  custom action function classes
        star.connectors         custom remote filestore classes
        startable.load.dialogs  custom load dialogue classes
        startable.readers       custom table input handlers
        startable.writers       custom table output handlers
        startable.storage       default storage policy

The meaning of the flags is as follows:
-f <format>
Signifies that the file(s) named after it on the command line are in a particular file format. Some file formats (VOTable, FITS) can be detected automatically by TOPCAT, but others (including Comma-Separated Values) cannot - see Section 3.1. In this case you have to specify with the -f flag what format the named files are in. Any table file on the command line following a -f <format> sequence must be in the named format until the next -f flag. The names of both the auto-detected formats (ones which don't need a -f) and the non-auto-detected formats (ones which do) are given in the usage message you can see by giving the -help flag (this message is shown above). You may also use the classname of a class on the classpath which implements the TableBuilder interface - see SUN/252.
-help
If the -help (or -h) flag is given, TOPCAT will write a short usage message and exit straight away.
-version
If the -version flag is given, TOPCAT will print a summary of its version and the versions and availability of some of its components, and exit straight away.
-demo
The -demo flag causes the program to start up with a few demonstration tables loaded in. You can use these to play around with its facilities. Note that these demo tables are quite small, to avoid taking up a lot of space in the installation, and don't contain particularly sensible data; they are just to give an idea.
-disk
If the -disk flag is given then the program will use disk backing storage for caching table data that is read in, rather than keeping it in memory. This means that tables much larger than the heap memory assigned to Java can be used. It may lead to slower processing, but usually the performance is not greatly reduced. If you find TOPCAT running out of memory (you see OutOfMemoryErrors popping up in windows or on the console) then re-running with the -disk flag is a good idea. The temporary data files are written in the default temporary directory (defined by the java.io.tmpdir system property - often /tmp) and deleted when the program exits, unless it exits in an unusual way.

There are a couple of additional points to make here: firstly, uncompressed FITS binary tables are not read into memory in any case (they are mapped) so the -disk flag may not make much difference with FITS. Secondly, if you try to load tables which require temporary disk files bigger than the total amount of physical memory available, certain actions can result in disk thrashing and become very slow.

-verbose
The -verbose (or -v) flag increases the level of verbosity of messages which TOPCAT writes to standard output (the console). It may be repeated to increase the verbosity further. The messages it controls are currently those written through java's standard logging system - see the description of the Log Window for more information about this.
-noserv
By default, if the relevant classes are available, TOPCAT starts up an AXIS server which can accept requests using the SOAP protocol to display tables from other applications. Specifying the -noserv flag prevents the server being started.
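
The rule that a -f <format> sequence applies to every following table until the next -f flag can be sketched as a simple loop over the arguments. The code below is an illustration of that rule only, not TOPCAT's actual argument parser; the class and method names are hypothetical, other flags are not handled, and a null format is used to mean "rely on automatic detection".

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not TOPCAT's own argument handling. */
class TableArgsSketch {

    /** Returns {location, format} pairs; format is null where none was given. */
    static List<String[]> assignFormats(String[] args) {
        List<String[]> tables = new ArrayList<>();
        String format = null;
        for (int i = 0; i < args.length; i++) {
            if ("-f".equals(args[i])) {
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("-f requires a format name");
                }
                format = args[++i];   // applies until the next -f flag
            } else {
                tables.add(new String[] { args[i], format });
            }
        }
        return tables;
    }

    public static void main(String[] args) {
        String[] example = { "a.fits", "-f", "csv", "b.csv", "c.csv", "-f", "ascii", "d.txt" };
        for (String[] table : assignFormats(example)) {
            System.out.println(table[0] + " -> " + table[1]);
        }
    }
}
```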

Some of the flags control what load dialogues are visible in the Load Window. In fact all of these load dialogues can be accessed from the Load Window's DataSources menu as long as the classes are available, but if you specify these flags on the command line, the corresponding button will appear in the main part of the window, making the option more obvious. The load dialogue flags are:

-tree
Hierarchy browser (Appendix A.4.2), used for content-sensitive browsing of the filespace (not available with topcat-lite).
-file
Basic file browser. This doesn't do much that the Filestore Browser (which is present by default, and can also access remote filespaces) can't do, but is provided as a fallback.
-sql
SQL query (Appendix A.4.3), used for obtaining tables from an SQL query on a relational database (only available if JDBC is set up correctly).
-cone
Cone search dialogue (Appendix A.4.4), used for obtaining catalogues of sources in a region of the sky from remote data servers (not available with topcat-lite).
-registry
Registry query dialogue (experimental), used for obtaining the results of a query on a VO resource registry as a table (not available with topcat-lite).
-siap
Simple Image Access Protocol dialogue (experimental), used for obtaining the results of SIAP queries on remote data servers (not available with topcat-lite).
These flags in most circumstances do just the same as adding the relevant dialogue class name to the startable.load.dialogs system property (see Section 7.2.3).

Other arguments on the command line are taken to be the locations of tables. Any tables so specified will be loaded into TOPCAT at startup. These locations are typically filenames, but could also be URLs or SQL queries, or perhaps something else. In addition they may contain "fragment identifiers" (with a "#") to locate a table within a given resource, so that for instance the location

   /my/data/cat1.fits#2
means the second extension in the multi-extension FITS file /my/data/cat1.fits.
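
A location of this kind can be thought of as a resource name plus an optional fragment. The sketch below separates the two at the last "#" character; it is a hypothetical illustration of the idea (the class and method names are invented) and makes no claim about how TOPCAT itself interprets locations, particularly for URLs or SQL queries.

```java
/** Illustrative sketch only; not TOPCAT's own location handling. */
class FragmentSketch {

    /** Returns {resource, fragment}; fragment is null if there is no "#". */
    static String[] splitLocation(String location) {
        int hash = location.lastIndexOf('#');
        if (hash < 0) {
            return new String[] { location, null };
        }
        return new String[] { location.substring(0, hash), location.substring(hash + 1) };
    }

    public static void main(String[] args) {
        String[] parts = splitLocation("/my/data/cat1.fits#2");
        System.out.println("resource: " + parts[0] + ", fragment: " + parts[1]);
    }
}
```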

Note that options to Java itself may also be specified on the command-line, as described in the next section.

7.2 Java Options

As described above, depending on how you invoke TOPCAT you may be able to specify arguments to Java itself (the "Java Virtual Machine") which affect how it runs. These may be defined on the command line or in some other way. The following subsections describe how to control Java in ways which may be relevant to TOPCAT; they are however somewhat dependent on the environment you are running in, so you may experience OS-dependent variations.

7.2.1 Class Path

The classpath is the list of places that Java looks to find the bits of compiled code that it uses to run an application. When running TOPCAT this always has to include the TOPCAT classes themselves - this is part of the usual invocation and is described in Section 7. However, for certain things Java might need to find some other classes, in particular for:

If you are going to use these facilities you will need to start the program with additional class path elements that point to the location of the classes required. How you do this depends on how you are invoking TOPCAT. If you are using the topcat startup script, you can write:

    topcat -classpath other-paths ...
(this adds the given paths to the standard ones required for TOPCAT itself). If you are invoking java directly, then you can either write on the command line:
    java -classpath path/to/topcat.jar:other-paths
         uk.ac.starlink.topcat.Driver ...
or set the CLASSPATH environment variable something like this:
    setenv CLASSPATH path/to/topcat.jar:other-paths

In any case, multiple (extra) paths should be separated by colons in the other-paths string.

Note that if you are running TOPCAT using java's -jar flag, any attempt you make to specify the classpath will be ignored! This is to do with Java's security model. If you need to specify a classpath which includes more than the TOPCAT classes themselves, you can't use java -jar.
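
If you are unsure which class path is actually in effect, a small Java program can list it. The sketch below reads the standard java.class.path system property; the class name ClasspathReport is hypothetical and is not part of TOPCAT.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative only: lists the elements of a class path string. */
final class ClasspathReport {

    /** Splits a class path on the given separator, skipping empty elements. */
    static List<String> entries(String classpath, String separator) {
        List<String> result = new ArrayList<String>();
        for (String element : classpath.split(Pattern.quote(separator))) {
            if (!element.isEmpty()) {
                result.add(element);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // java.class.path holds the class path the JVM was started with.
        String cp = System.getProperty("java.class.path");
        for (String element : entries(cp, File.pathSeparator)) {
            System.out.println(element);
        }
    }
}
```

File.pathSeparator is a colon on Unix-like systems and a semicolon on Windows. Running the program with the same -classpath argument you intend to give TOPCAT shows whether the extra paths have been picked up.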

7.2.2 Memory Size

If TOPCAT fails during operation with a message that says something about a java.lang.OutOfMemoryError, then your heap size is too small for what you are trying to do. You will have to run java with a bigger heap size using the -Xmx flag. Invoking TOPCAT from the topcat script you would write something like:

    topcat -Xmx256M ...
or using java directly:
    java -Xmx256M ...
which means use up to 256 megabytes of memory (don't forget the "M" for megabyte). JVMs typically default to a heap size of 64M. You probably don't want to specify a heap size larger than the physical memory of the machine that you are running on.

There are other types of memory and tuning options controlled using options of the form -X<something-or-other>; if you're feeling adventurous you may be able to find out about these by typing "java -X".

Note however: using the -disk flag described in Section 7.1 may be a better solution; this makes the program store data from large tables on disk rather than in memory.
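
To see what heap limit a given -Xmx setting produces, you can ask the Java runtime directly. The following is a minimal sketch using the standard Runtime.maxMemory() call; the class name HeapReport is hypothetical and is not part of TOPCAT.

```java
/** Illustrative only: reports the maximum heap size available to the JVM. */
final class HeapReport {

    /** Converts a byte count to whole megabytes (1 MB = 1024 * 1024 bytes). */
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // maxMemory() returns the most memory the JVM will attempt to use.
        long max = Runtime.getRuntime().maxMemory();
        System.out.println("Maximum heap: " + toMegabytes(max) + " MB");
    }
}
```

Running it as "java -Xmx256M HeapReport" should report a figure close to 256, though the exact value can differ slightly between JVMs.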

7.2.3 System properties

System properties are a way of getting information into Java (they are the Java equivalent of environment variables). The following ones have special significance within TOPCAT:

java.io.tmpdir
The directory in which TOPCAT will write any temporary files it needs. This is usually only done if the -disk flag has been specified (see Section 7.1).
jdbc.drivers
Can be set to a (colon-separated) list of JDBC driver classes through which SQL databases can be accessed (see Section 7.3).
jel.classes
Can be set to a (colon-separated) list of classes containing static methods which define user-provided functions for synthetic columns or subsets (see Section 6.8).
jel.classes.activation
Can be set to a (colon-separated) list of classes containing static methods which define user-provided functions for use in custom activation expressions (see Section 6.8).
star.connectors
Can be set to a (colon-separated) list of classes which provide access to remote filespace implementations. Thus-named classes should implement the uk.ac.starlink.connect.Connector interface which specifies how you can log on to such a service and provides a hierarchical view of the filespace it contains.
startable.load.dialogs
Can be set to a (colon-separated) list of custom table load dialogue classes. Briefly, you can install your own table import dialogues at runtime by providing classes which implement the uk.ac.starlink.table.gui.TableLoadDialog interface and naming them in this property. See STIL documentation for more detail.
startable.readers
Can be set to a (colon-separated) list of custom table format input handler classes (see Section 3.3).
startable.storage
Can be set to determine the default storage policy. Setting it to "disk" has basically the same effect as supplying the "-disk" argument on the TOPCAT command line (see Section 7.1).
startable.writers
Can be set to a (colon-separated) list of custom table format output handler classes (see Section 3.3).
votable.strict
Set true for strict enforcement of the VOTable standard when parsing VOTables. This prevents the parser from working round certain common errors, such as missing arraysize attributes on FIELD/PARAM elements with datatype="char". False by default.
apple.laf.useScreenMenuBar
On the Apple Macintosh platform only, this property controls whether menus appear at the top of the screen as usual for Mac, or at the top of individual windows as usual for Java. By default it is set to true for TOPCAT, so menus mostly appear at the top of the screen (though it's not true to say that TOPCAT obeys the Mac look and feel completely); if you prefer the more Java-like look and feel, set it to false.

To define these properties on the command line you use the -D flag, which has the form

    -D<property-name>=<value>
If you're using the TOPCAT startup script, you can write something like:
    topcat -Djdbc.drivers=org.postgresql.Driver ...
or if you're using the java command directly:
    java -Djdbc.drivers=org.postgresql.Driver ...

Alternatively you may find it more convenient to write these definitions in a file named .starjava.properties in your home directory; the above command-line flag would be equivalent to inserting the line:

    jdbc.drivers=org.postgresql.Driver
in your .starjava.properties file.
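
Several of the properties above hold colon-separated lists of class names, and the .starjava.properties file holds name=value lines. The following Java sketch shows how such definitions could be parsed. It assumes the file follows the standard java.util.Properties format, which the single-line example above is consistent with but which this document does not state. The class StarjavaProperties is hypothetical and is not TOPCAT's own code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Illustrative only: parses name=value definitions and colon-separated lists. */
final class StarjavaProperties {

    /** Parses text in the standard java.util.Properties format (an assumption). */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new IllegalStateException("Could not read properties text", e);
        }
        return props;
    }

    /** Splits a colon-separated property value into its elements. */
    static String[] splitList(String value) {
        if (value == null || value.isEmpty()) {
            return new String[0];
        }
        return value.split(":");
    }
}
```

For instance, parsing the line jdbc.drivers=org.postgresql.Driver and looking up jdbc.drivers yields the same value that the -D flag shown above would define.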

7.3 JDBC Configuration

This section describes additional configuration which must be done to allow TOPCAT to access SQL-compatible relational databases for reading (see Section 3.1.5) or writing (see Section 3.2.6) tables. If you don't need to talk to SQL-type databases, you can ignore the rest of this section. The steps described here are the standard ones for configuring JDBC (which sort-of stands for Java Database Connectivity), described in more detail on Sun's JDBC web page.

To use TOPCAT with SQL-compatible databases you must:

Installing the driver consists of two steps:
  1. Set the jdbc.drivers system property to the name of the driver class as described in Section 7.2.3
  2. Ensure that the classpath you are using includes this driver class as described in Section 7.2.1

These steps are all standard for use of the JDBC system.

To the author's knowledge, TOPCAT has so far successfully been used with the following RDBMSs and corresponding JDBC drivers:

MySQL
MySQL 3.23.55 on Linux has been tested with the Connector/J driver version 3.0.8 and seems to work, though tables with very many (hundreds of) columns cannot be written owing to SQL statement length restrictions. Note there is known to be a column metadata bug in version 3.0.6 of the driver which can cause a ClassCastException error when tables are written.
PostgreSQL
PostgreSQL 7.4.1 apparently works with its own driver. Note the performance of this driver appears to be rather poor, at least for writing tables.
Other RDBMSs and drivers may or may not work - please let us know the results of any experiments you carry out. Sun maintain a list of JDBC drivers for various databases; it can be found at http://servlet.java.sun.com/products/jdbc/drivers.

Here are a couple of command lines to start up TOPCAT using databases known to work.

PostgreSQL
   java -classpath topcat-full.jar:pg73jdbc3.jar \
        -Djdbc.drivers=org.postgresql.Driver \
        uk.ac.starlink.topcat.Driver
MySQL
   java -classpath topcat-full.jar:mysql-connector-java-3.0.8-bin.jar \
        -Djdbc.drivers=com.mysql.jdbc.Driver \
        uk.ac.starlink.topcat.Driver
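
If a database connection fails, it can help to check that both installation steps took effect, i.e. that jdbc.drivers is set and that each class it names can be found on the class path. The following Java sketch does this using only standard Java calls; the class name DriverCheck is hypothetical and is not part of TOPCAT.

```java
/** Illustrative only: checks that classes named in jdbc.drivers can be loaded. */
final class DriverCheck {

    /** Returns true if the named class can be loaded from the class path. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class was found but could not be linked or initialised.
            return false;
        }
    }

    public static void main(String[] args) {
        // jdbc.drivers is a colon-separated list of driver class names.
        String drivers = System.getProperty("jdbc.drivers", "");
        for (String name : drivers.split(":")) {
            if (!name.isEmpty()) {
                String status = isLoadable(name) ? "found" : "NOT on class path";
                System.out.println(name + ": " + status);
            }
        }
    }
}
```

Run it with the same -classpath and -Djdbc.drivers arguments that you give to TOPCAT, adding the directory containing DriverCheck to the class path.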

7.4 Examples

Here are some examples of invoking TOPCAT from the command line. In each case two forms are shown: one using the topcat script, and one using the jar file directly. In the latter case, the java command is assumed to be on your path, and the jar file itself, assumed to be in directory my/tcdir, might be named topcat.jar, topcat-full.jar, or something else, but the form of the command is the same.

No arguments
    topcat
    java -jar my/tcdir/topcat.jar
Output usage message
    topcat -h
    java -jar my/tcdir/topcat.jar -h
Load a FITS file
    topcat testcat.fits
    java -jar my/tcdir/topcat.jar testcat.fits
Loading files of various formats
    topcat t1.fits -f ascii t2.txt t3.txt -f votable t4.xml
    java -jar my/tcdir/topcat.jar t1.fits -f ascii t2.txt t3.txt -f votable t4.xml
Use disk storage format and boosted heap memory
    topcat -Xmx256M -disk 
    java -Xmx256M -jar my/tcdir/topcat.jar -disk
Make custom functions available
    topcat -classpath my/funcdir/funcs.jar -Djel.classes=my.ExtraFuncs t1.fits
    java -classpath my/tcdir/topcat.jar:my/funcdir/funcs.jar \
         -Djel.classes=my.ExtraFuncs \
         uk.ac.starlink.topcat.Driver t1.fits
Make PostgreSQL database connectivity available
    topcat -classpath my/jdbcdir/pg73jdbc3.jar -Djdbc.drivers=org.postgresql.Driver
    java -classpath my/tcdir/topcat.jar:my/jdbcdir/pg73jdbc3.jar \
         -Djdbc.drivers=org.postgresql.Driver uk.ac.starlink.topcat.Driver
Use custom I/O handlers
    topcat -classpath my/driverdir/drivers.jar \
           -Dstartable.readers=my.MyTableBuilder \
           -Dstartable.writers=my.MyTableWriter
    java -classpath my/tcdir/topcat.jar:my/driverdir/drivers.jar \
         -Dstartable.readers=my.MyTableBuilder \
         -Dstartable.writers=my.MyTableWriter \
         uk.ac.starlink.topcat.Driver
The -Dx=y definitions can be avoided by putting equivalent x=y lines into the .starjava.properties file in your home directory.
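
As a rough illustration of the kind of class the "Make custom functions available" example refers to, here is a minimal Java sketch of a class supplying extra functions as public static methods. The method names and their contents are hypothetical, and the class is shown in the default package for brevity (to be referred to as my.ExtraFuncs it would need a "package my;" declaration). See Section 6.8 for the actual requirements on such classes.

```java
/**
 * Hypothetical example of a class supplying user-provided functions.
 * Each function is a public static method.
 */
public class ExtraFuncs {

    /** Not meant to be instantiated; it only holds static functions. */
    private ExtraFuncs() {
    }

    /** Converts a flux to a magnitude: zeroPoint - 2.5 * log10(flux). */
    public static double fluxToMag(double flux, double zeroPoint) {
        return zeroPoint - 2.5 * Math.log10(flux);
    }

    /** Returns the mean of two values. */
    public static double mean2(double a, double b) {
        return 0.5 * (a + b);
    }
}
```

Once compiled and packed into a jar file such as my/funcdir/funcs.jar, the class would be named in the jel.classes property as shown above.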


A TOPCAT Windows

This appendix gives a tour of all the windows that form the TOPCAT application, explaining the anatomy of the windows and the various tools, menus and other controls. Attributes common to many or all windows are described in Appendix A.1, and the subsequent sections describe each of the windows individually.

When the application is running, the Help For Window toolbar button will display the appropriate description for the window on which it is invoked.


A.1 Common Window Features

This section describes some features which are common to many or all of the windows used in the TOPCAT program.

A.1.1 Toolbar

Each window has a toolbar at the top containing various buttons representing actions that can be invoked from the window. Most of them contain the following buttons:

Control Window
Ensures that the Control Window is visible on the screen, deiconifying and raising it if necessary. This can be useful if you 'lose' the window behind a proliferation of other ones.
Close
Closes the window. This convenience button has the same effect as closing the window in whatever way your graphics platform normally allows. In most cases, closing the window does not lose state associated with it (such as fields filled in); if you recover the window later it will look the same as when you closed it.
Help
Pops up a Help browser window, or makes sure it is visible if it has already been opened. The window will display help information relevant to the window in which you pushed this button.

Buttons in the toolbar often appear in menus of the same window as well; you can identify them because they have the same icon. This is a convenience; invoking the action from the toolbar or from the menu will have the same effect.

Often an action will only be possible in certain circumstances, for instance if some rows in the associated JTable have been selected. If the action is not possible (i.e. it would make no sense to invoke it) then the button in the toolbar and the menu option will be greyed out, indicating that it cannot be invoked in the current state.

A.1.2 Menus

Most windows have a menu bar at the top containing one or more menus. These menus will usually provide the actions available from the toolbar (identifiable because they have the same icons), and may provide some other less-commonly-required actions too.

Here are some of the menus common to several windows:

File menu
Nearly all windows have this menu. At least the following options are available:
Close
Closes the window. This menu item has the same effect as closing the window in whatever way your graphics platform normally allows. In most cases, closing the window does not lose state associated with it (such as fields filled in); if you recover the window later it will look the same as when you closed it.
Exit
Exits TOPCAT. You will be prompted to confirm this action if tables are loaded, since it might result in loss of data.
Help menu
Nearly all windows have this menu. The following options are available:
Help
Pops up the Help Window.
Help For Window
Pops up the Help Window; the window will display help information relevant to the window in which the menu appears.
About TOPCAT
Pops up a little window giving information on the version and authorship of the program. It also reports on availability of some optional components.
Display menu
This menu is available for most windows which display their data using a JTable component. If present, it contains a list of the columns in the JTable with tickboxes next to them - clicking on a column name in this menu toggles whether the column is visible in the window.

A.1.3 JTables

An example JTable

Many of the windows, including the Data Window, display their data in a Java widget called a JTable. This displays a grid of values, with headings for each column, in a window which you can scroll around. Although JTables are used for a number of different things (for instance, showing the table data themselves in the Data Window and showing the column metadata in the Columns Window), the fact that the same widget is used provides a common look and feel.

Here are some of the things you can do with a JTable:

Scroll around
Using the scrollbars which may appear to the right and below the table you can scroll around it to see parts which are not initially visible. You can grab the sliders and drag them around by holding the mouse button down while you move it, or click in the slider "trough" one side or other of the current slider position to move a screenful. Under some circumstances the cursor arrow keys and PageUp/PageDown keys may move the table too. If the JTable is small enough to fit within the window the scrollbars may not appear.
Move columns
By clicking on the header (grey title bit at the top) of a column and dragging it left or right, you can change the order of columns as displayed. In some cases (the Data Window) this actually has the effect of changing the order of the columns in the table; in other cases it is just cosmetic.
Resize columns
By dragging on the line between column headers you can change the width of the columns in the table.
Edit cells
In some cases, cells are editable. If they are, then double-clicking in the cell will begin an edit session for that cell, and pressing Return will confirm that the edit has been made.
Select rows
Sometimes rows can be highlighted; you can select one row by clicking on it or a number of contiguous rows by clicking and dragging from the first to the last. To add further rows to a set already selected without deselecting the first lot, hold the "Control" key down while you do it.

In some cases where a JTable is displayed, there will be a menu on the menu bar named Display. This permits you to select which columns are visible and which are hidden. Clicking on the menu will give you a list of all the available columns in the table, each with a checkbox by it; click the box to toggle whether the column is displayed or not.


A.2 Control Window

The Control Window

The Control Window is the main window from which all of TOPCAT's activities are controlled. It lists the known tables, summarises their characteristics, and allows you to open other windows for more specialised tasks. When TOPCAT starts up you will see this window - it may or may not have some tables loaded into it according to how you invoked the program.

The window consists of two main parts: the Table List panel on the left, and the Current Table Properties panel on the right. Tables loaded into TOPCAT are shown in the Table List, each identified by an index number which never changes for a given table, and a label which is initially set from its location, but can be changed for convenience.

One of the tables in the list is highlighted, which means it is the currently selected table; you can change the selection by clicking on an item in the list. Information about the selected table is shown in the properties panel on the right. This shows such things as the number of rows and columns, current sort order, current row subset selection and so on. It contains some controls which allow you to change these properties. Additionally, many of the buttons in the toolbar relate to the currently selected table.

The Table List, Current Table Properties panel, and actions available from the Control Window's toolbar and menus are described in the following subsections.

A.2.1 Table List

The Table List panel on the left of the Control Window is pretty straightforward - it lists all the tables currently known to TOPCAT. If a new table is introduced by loading it from the Load Window or as a result of some action such as table joining then its name and number will appear in this list. The currently selected table is highlighted - clicking on a different table name (or using the arrow keys if the list has keyboard focus) will change the selection. The properties of the selected table are displayed in the Current Table Properties panel to its right, and a number of the toolbar buttons and menu items refer to it.

If you double-click on a table in the list, or press Return while it is selected, that table's Data Window will appear.

Certain other applications (Treeview, FROG, or even another instance of TOPCAT) can interoperate with TOPCAT using drag-and-drop, and for these the table list is a good place to drag/drop tables. For instance you can drag a table node off of the main panel of Treeview and drop it onto the table list of TOPCAT, and you will be able to use that table as if it had been loaded from disk. You can also paste the filename or URL of a table onto the table list, and it will be loaded.

A.2.2 Current Table Properties panel

The Current Table Properties panel on the right hand side of the Control Window contains a number of controls which relate to the currently selected table and its Apparent properties; they will be blank if no table is selected. Here is what the individual items mean:

Label
The short name associated with the selected table. It is used in the Table List panel and in labelling view windows so you can see which table they refer to. It is usually set initially according to where the table came from, but you can change it by typing into the text field.
Location
The original source of the selected table. This is typically a filename or URL (perhaps abbreviated), but may be something else depending on where the table came from.
Name
A name associated with the selected table. This may be derived from the table's filename if it had one or from some naming string stored within the table metadata.
Rows
The number of rows in the selected table. If the current Row Subset does not include all the rows, then an indication of how many are visible within that subset will be given too.
Columns
The number of columns in the selected table. If some are currently hidden (not included in the current Column Set), an indication of how many are visible will be given too.
Sort Order
The column (if any) which determines the current Row Order. A selector shows the column (if any) on which the rows of the Apparent Table are sorted and allows you to choose a different one. The little arrow beside it indicates whether the sort is ascending or descending.
Row Subset
The name of the current Row Subset. A selector shows the name of the subset which determines which rows are part of the Apparent Table and allows you to choose another one. "All" indicates that all rows are included.
Activation Action
The currently selected Activation Action. The action can be changed by clicking on this button to display the Activation Window.

A.2.3 Toolbar Buttons

The following buttons deal with table import and export:

Load Table
Pops up the Load Table dialogue which allows you to load a table into TOPCAT. If a table is loaded it becomes the new current table.
Save Table
Pops up the Save Table dialogue which allows you to write out the current Apparent Table.
Duplicate Table
Adds a new copy of the current Apparent Table to the list of known tables. This is like loading in the current table again, except that its apparent characteristics become the basic characteristics of the copied one, so for instance whatever is the current row order becomes the natural order of the new one.

The following buttons display various views of the current table; these views are described in more detail in Appendix A.3.

Data Window
Displays the table rows and columns in a scrollable viewer so you can see the cell contents themselves.
Parameters Window
Displays table "parameters", that is metadata which applies to the whole table.
Columns Window
Displays metadata about each column such as data type, units, description, UCDs etc.
Subsets Window
Displays the currently defined row subsets and enables new ones to be defined.
Statistics Window
Displays a window for calculating statistical quantities for the values in each column of the table.
Plot Window
Permits plotting of columns against each other.

The following buttons deal with matching and joining tables (see Section 4 for discussion of these functions):

Concatenation Window
Displays a dialogue for joining tables top-to-bottom.
Internal Match Window
Displays a dialogue for finding internal matches between the rows of a table.
Pair Match Window
Displays a dialogue for joining tables side-by-side by locating rows which match between them.

The following buttons are miscellaneous:

Available Functions Window
Displays a window containing all the functions which can be used for writing algebraic expressions (see Section 6).

A.2.4 Menu Items

This section describes actions available from the Control Window menus additional to those also available from the toolbar (described in the previous section) and those common to other windows (described in Appendix A.1.2).

The File menu contains the following additional actions:

Discard Table
Removes the current table from the list and closes and forgets any view windows associated with it. A discarded table cannot be reinstated. You will be prompted to confirm this action. Discarding a table in this way may free up memory for other operations, but often will not; whether it does or not depends on the details of where the table comes from.
Export To Mirage
Starts up the external Mirage application on the current apparent table. This action is only available if Mirage is on your classpath. See Section 3.2.9.
View Log
Pops up the Log Window to display logging messages generated by the application. Intended mainly for debugging.

The Windows menu contains actions for controlling which table view windows are currently visible on the screen. If you have lots of tables and are using various different views of several of them, the number of windows on the screen can get out of hand and it's easy to lose track of what window is where. The actions on this menu do some combination of either hiding or revealing all the various view windows associated with either the selected table or all the other ones. Windows hidden are removed from the screen but if reactivated (e.g. by using the appropriate toolbar button) will come back in the same place and the same state. Revealing all the windows associated with a given table means showing all the view windows which have been opened before (it won't display windows which have never explicitly been opened).

Show Selected Views Only
Reveal all view windows associated with the currently selected table and hide all others.
Show Selected Views
Reveal all view windows which are associated with the currently selected table.
Show All Views
Reveal all view windows associated with all tables.
Hide Unselected Views
Hide all view windows associated with tables other than the currently selected one.
Hide Selected Views
Hide all view windows associated with the currently selected table.
Hide All Views
Hide all the view windows. If you get really confused, this is a good one to select to clear up your screen prior to reinstating the ones that you actually want to look at.
Note that the Control Window item on menus on all other windows is also useful for window management - it brings back the Control Window if it's been hidden.

The Joins menu, as well as containing the actions for table concatenation, internal matching and pair matching which are available from the toolbar, also gives you the option to join three or four tables at once by matching rows. The multi-table match windows work pretty much the same as the Pair Match Window, but with more tables.


A.3 Table View Windows

Many of the windows you will see within TOPCAT display information about a single table. There are several of these, each displaying a different aspect of the table data - cell contents, statistics, column metadata, plotted values etc. There is one of each type for each of the tables currently loaded, though they won't necessarily all be displayed at once. The title bar of these windows will say something like TOPCAT(3): Table Columns, which indicates that it is displaying information about the column metadata for the table labelled "3:" in the Control Window.

To open any of these windows, select the table of interest in the Control Window and click the appropriate toolbar button (or the equivalent item in the Table Views menu). This will either open up a new window of the sort you have requested, or if you have opened it before, will make sure it's visible.

If you have lots of tables and are using various different views of several of them, the number of windows on the screen can get out of hand and it's easy to lose track of what window is where. In this case the Control Window's Windows menu (described in Appendix A.2.4), or the File|Control Window menu item in any of the view windows can be handy to keep them under control.

The following sections describe each of these table view windows in turn.

A.3.1 Data Window

Data Window

The Data Window presents a JTable containing the actual cells of the Apparent Table. You can display it using the Table Data button when the chosen table is selected in the Control Window's Table List.

You can scroll around the table in the usual way. In most cases you can edit cells by double-clicking in them, though some cells (e.g. ones containing arrays rather than scalars) cannot currently be edited. If it looks like an edit has taken place, it has.

There is a grey column of numbers on the left of the JTable which gives the row index of each row. This is the value of the special Index column, which numbers each row of the original (not apparent) table starting at 1. If the table has been sorted these numbers may not be in order.
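
The relationship between the Index values and a sorted display can be illustrated with a small Java sketch. It is not TOPCAT code; the class and method names are hypothetical, and it simply orders the 1-based row numbers of the original table by the values in one column.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Illustrative only: shows original row indices after sorting on a column. */
final class IndexDemo {

    /** Returns the 1-based original row indices in ascending order of value. */
    static Integer[] sortedIndices(final double[] values) {
        Integer[] index = new Integer[values.length];
        for (int i = 0; i < values.length; i++) {
            index[i] = i + 1;   // Index numbering starts at 1
        }
        Arrays.sort(index, new Comparator<Integer>() {
            public int compare(Integer a, Integer b) {
                return Double.compare(values[a - 1], values[b - 1]);
            }
        });
        return index;
    }
}
```

For column values 3.5, 1.0, 2.2 the sorted display would show rows with Index values 2, 3, 1 in that order, so the grey column is no longer in sequence.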

Note that reordering the columns by dragging their headings around will change the order of columns in the table's Column Set and hence the Apparent Table.

If you have a table with very many columns it can be difficult to scroll the display sideways so that a column you are interested in is in view. In this case, you can go to the Columns Window and click on the description of the column you are after in the display there. This will have the effect of scrolling the Data Window sideways so that your selected column is visible in the centre of the display here.

The following buttons are available in the toolbar:

Subset From Selected Rows
Defines a new Row Subset consisting of those rows which are currently highlighted. You can highlight a contiguous group of rows by dragging the mouse over them; further contiguous groups can be added by holding the Control key down while dragging. This action is only available when some rows have been selected.
Subset From Unselected Rows
Defines a new Row Subset consisting of those rows which are visible but currently not highlighted. You can highlight a contiguous group of rows by dragging the mouse over them; further contiguous groups can be added by holding the Control key down while dragging. This action is only available when some rows have been selected.

As well as the normal menu, right-clicking over one of the columns in the displayed table will present a Column Popup Menu, which provides a convenient way to do some things with the column in question:

Replace Column
Pops up a Synthetic Column dialogue to replace this column with a new synthetic one. The dialogue is initialised with the same name, units etc as the selected column, and with an expression that evaluates to its value. You can alter any of these, and the new column will replace the old one, which will be hidden and renamed by appending a suffix like "_old" to its name.
New Synthetic Column
Pops up a Synthetic Column dialogue to insert a new synthetic column just after this one.
Sort up
Sorts the table rows according to ascending value of the contents of the column. Only available if some kind of order (e.g. numeric or alphabetic) can sensibly be applied to the column.
Sort down
Sorts the table rows according to descending value of the contents of the column. Only available if some kind of order (e.g. numeric or alphabetic) can sensibly be applied to the column.
Hide
Hides the column. It can be reinstated from the Columns window.
Search Column
For string-valued columns, this option allows you to search for values in a column. If you select it you will be asked to enter a regular expression, and then any row which matches that expression in this column will be selected (highlighted). If there's just one matching row it will be activated as well. The expression obeys normal regular expression syntax, so for instance you'd enter ".*XYZ.*" to find all rows which contain the string "XYZ".
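
The ".*XYZ.*" example suggests that the expression has to match the whole cell value rather than just part of it. Under that assumption, the behaviour can be sketched in Java using the standard java.util.regex classes. The class ColumnSearchDemo is hypothetical and is not TOPCAT's own search code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative only: finds rows whose value matches a regex in full. */
final class ColumnSearchDemo {

    /** Returns the 1-based indices of values which the regex matches entirely. */
    static List<Integer> matchingRows(String[] columnValues, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<Integer> rows = new ArrayList<Integer>();
        for (int i = 0; i < columnValues.length; i++) {
            String value = columnValues[i];
            // matches() requires the whole value to match, not a substring.
            if (value != null && pattern.matcher(value).matches()) {
                rows.add(i + 1);
            }
        }
        return rows;
    }
}
```

With the values "NGC XYZ 1", "XYZ" and "abc", the expression ".*XYZ.*" selects the first two rows, while the bare expression "XYZ" selects only the second.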

A.3.2 Parameters Window

Parameters Window

The Parameters Window displays metadata which applies to the whole table (rather than that for each column). You can display it using the Table Parameters button when the chosen table is selected in the Control Window's Table List.

In table/database parlance, an item of per-table metadata is often known as a "parameter" of the table. The number of rows and columns will always be listed; some table file formats don't have facilities for storing other table parameter metadata, so there may not be much of interest displayed in this window.

The display is a JTable with one row for each parameter. It indicates the parameter's name, its value, the type of item it is (integer, string etc) and other items of interest such as units, dimensionality or UCD if they are defined. If a column of the table has no entries (for instance, the Units column might be empty because none of the parameters has had units defined for it) then that column may be absent from the display - in this case the Display menu can be used to reveal it.

You can edit some parameter values and descriptions by double-clicking on them as usual.

The following items are available in the toolbar:

Add Parameter
Pops up a New Parameter Window to allow you to add a new parameter to the table.
Remove Parameter
If one of the parameters displayed in the JTable in this window has been selected by clicking on its row, then clicking this button will remove it. You will be prompted before the removal takes place. Some parameters such as Row Count cannot be removed.

A.3.3 Columns Window

Columns Window

The Columns Window displays a JTable giving all the information (metadata) known about each column in the table. You can display it using the Column Info button when the chosen table is selected in the Control Window's Table List.

The display may take a little bit of getting used to, since each column in the main data table is represented by a row in the JTable displayed here. The order and widths of the columns of the JTable widget can be changed in the same way as those for the Data Window JTable, but this has no effect on the data.

The leftmost column, labelled "Visible", contains a checkbox in each row (one for each column of the data table). Initially, these are all ticked. By clicking on those boxes, you can toggle them between ticked and unticked. When a box is unticked, the column in question becomes hidden. The row can still be seen in this window, but the corresponding data column is no longer a part of the Apparent Table, so will not be seen in the Data Window or appear in exported versions of the table. You can tick/untick multiple columns at once by highlighting a set of rows by dragging the mouse over them and then using the Hide Selected or Reveal Selected toolbar buttons or menu items.

Each column in the displayed JTable corresponds to one piece of information for each of the columns in the data table - column name, description, UCD etc. Tables of different types (e.g. ones read from different input formats) can have different categories of metadata. By default a metadata category is displayed in this JTable if at least one table column has a non-blank value for that metadata category, so for instance if no table columns have a defined UCD then the UCD column will not appear. Categories can be made to appear and disappear however by using the Display menu. The metadata items are as follows:

Visible
Indicates whether the column is part of the Apparent Table. If this box is not filled in, then for most purposes the column will be hidden from display. You can toggle visibility by clicking on this column.
Name
The name of the column.
$ID
A unique and unchanging ID value for each column. These are useful in defining algebraic expressions (see Section 6) since they are guaranteed unique for each column. Although the column Name can be used as well, the Name may not be unique and may not have the correct form for use in an algebraic expression.
Class
The Java class of the items in that column. You don't have to know very much Java to understand these: they are Float or Double for floating point numbers, Byte, Short, Integer or Long for integer numbers, Boolean for a logical (true/false) flag, or String for a string of ASCII or Unicode characters. There are other possibilities, but these will cover most. The characters '[]' after the name of the class indicate that each cell in the column holds an array of the indicated type.
Shape
Cells of a table can contain arrays as well as scalars. If the column contains an array type, this indicates the shape that it should be interpreted as. It gives the dimensions in column-major order. The last element may be a '*' to indicate that the size of the array may be variable. For scalar columns, this item will be blank.
Units
The units in which quantities in this column are expressed.
Expression
The algebraic expression defining the values in this column. This will only be filled in if the column in question is a synthetic column which you have added, rather than one present in the data in their original loaded form.
Description
A textual description of the function of this column.
UCD
The UCD associated with this column, if one is specified. UCDs are Uniform Content Descriptors, and indicate the semantics of the values in this column.
UCD Description
If the string in the UCD column is the identifier of a known UCD, the standard description associated with that UCD is shown here.
There may be other items in the list specific to the table in question.
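
As an illustration of the Shape item, the following Java sketch shows what column-major order means for locating an element within an array-valued cell: the first dimension listed varies fastest. The class and method names are invented for this example, and the sketch assumes a fixed shape with no '*' element.

```java
// Illustrative only: ShapeSketch is a hypothetical name, not a TOPCAT class.
class ShapeSketch {

    // Offset of the element at the given indices within a flat array whose
    // dimensions are listed in column-major order (first index varies fastest).
    static int offset(int[] shape, int[] index) {
        int off = 0;
        int stride = 1;
        for (int i = 0; i < shape.length; i++) {
            off += index[i] * stride;
            stride *= shape[i];
        }
        return off;
    }
}
```

For a shape of (2,3), the element at indices (1,0) is at offset 1 and the element at (0,1) is at offset 2.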

You can edit column names and some other entries in this JTable by double-clicking on them as usual.

The order in which the rows are presented is determined by the table's current Column Set, so can be changed by dragging the column headers around in the Data Window.

The following buttons are available in the toolbar:

New Synthetic Column
This pops up a Synthetic Column Window which allows you to define a new column in terms of the existing ones by writing an algebraic expression. The new column will be added by default after the last selected column, or at the end if none is selected.
Add Sky Coordinate Columns
This pops up a Sky Coordinates Window which allows you to define a pair of new sky coordinate columns based on an existing pair of sky coordinate columns.
Replace Column With Synthetic
If a single column is selected, then clicking this button will pop up a Synthetic Column dialogue to replace the selected column with a new synthetic one. The dialogue is initialised with the same name, units etc as the selected column, and with an expression that evaluates to its value. You can alter any of these, and the new column will replace the old one, which will be hidden and renamed by appending a suffix like "_old" to its name.
Hide Selected Column(s)
If any of the columns are selected, then clicking this button will hide them, that is, remove them from the current Column Set. This has the same effect as deselecting all the checkboxes corresponding to these columns in the Visible column.
Reveal Selected Column(s)
If any of the columns are selected, then clicking this button will make sure they are visible, that is, that they appear in the current Column Set. This has the same effect as selecting all the checkboxes corresponding to these columns in the Visible column.
Explode Array Column
If a column is selected which has an array type, clicking this button will replace it with scalar-valued columns containing each of its elements. For instance if a column PMAG contains a 5-element vector of type float[] representing magnitudes in 5 different bands, then selecting it and hitting this button will hide PMAG and insert 5 new Float-type columns PMAG_1...PMAG_5 in its place each containing one of the magnitudes.
Sort Selected Up
If a single column is selected then the table's current Sort Order will be set to sort ascending on that column. Otherwise this action is not available.
Sort Selected Down
If a single column is selected then the table's current Sort Order will be set to sort descending on that column. Otherwise this action is not available.

Several of these actions operate on the currently selected column or columns. You can select columns by clicking on the corresponding row in the displayed JTable as usual. A side effect of selecting a single column is that the table view in the Data Window will be scrolled sideways so that the selected column is visible in (approximately) the middle of the screen. This can be a boon if you are dealing with a table that contains a large number of columns.
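
The effect of the Explode Array Column action described above can be sketched in Java as follows. This is a minimal illustration which assumes a fixed number of elements per cell and treats a missing array as a blank cell; the class and method names are invented, and only the PMAG_1...PMAG_5 naming pattern is taken from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: ExplodeSketch is a hypothetical name, not a TOPCAT class.
class ExplodeSketch {

    // Splits a column of float[] cells into scalar columns named
    // <name>_1 ... <name>_n, returned in element order.
    static Map<String, Float[]> explode(String name, float[][] cells, int nElements) {
        Map<String, Float[]> columns = new LinkedHashMap<>();
        for (int e = 0; e < nElements; e++) {
            Float[] scalar = new Float[cells.length];
            for (int row = 0; row < cells.length; row++) {
                float[] cell = cells[row];
                // A blank or short array cell gives a blank (null) scalar cell.
                scalar[row] = (cell != null && e < cell.length) ? cell[e] : null;
            }
            columns.put(name + "_" + (e + 1), scalar);
        }
        return columns;
    }
}
```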

A.3.4 Subsets Window

Subsets Window

The Subsets Window displays the Row Subsets which have been defined. You can display it using the Row Subsets button when the chosen table is selected in the Control Window's Table List.

The subsets are displayed in a JTable widget with a row for each subset. The columns of the JTable are as follows:

_ID
A unique and unchanging identifier for the subset, which consists of a "_" character (underscore) followed by an integer. This can be used to refer to it in expressions for synthetic columns or other subsets.

Note: in previous versions of TOPCAT the hash sign ("#") was used instead of the underscore for this purpose; the hash sign no longer has this meaning.

Name
A name used to identify the subset. It is ideally, but not necessarily, unique.
Size
The number of rows in this subset. This column is initially blank, and is not guaranteed to remain correct if the subset definitions or the table data change: counting may be an expensive process, so it is not done automatically with every change. A count can be forced by using the Count Subsets button described below.
Expression
If the subset has been defined by an algebraic expression, this will be here. It can be edited (double-click on the cell) to change the expression.
Column $ID
If the subset has been defined by equivalence with a boolean-valued column, this will show the $ID of the column that it came from (see Appendix A.3.3).

Entries in the Name and Expression columns can be edited by double-clicking on them in the normal way.

The following toolbar buttons are available in this window:

New Subset
Pops up the Algebraic Subset Window to allow you to define a new subset algebraically.
Invert Subset
Creates a new subset which is the complement of the selected one. The new one will include all the rows which are excluded by the selected one (and vice versa). To use this action, first select a subset by clicking on its row in the JTable.
To Column
If one of the rows in the JTable is selected, this will turn that subset into a new column. It will pop up the Synthetic Column Window, filled in appropriately to add a new boolean column to the table based on the selected subset. You can either accept it as is, or modify some of the fields. To use this action, first select a subset by clicking on its row in the JTable.
Count Subsets
Counts how many rows are in each subset and displays this in the Size column. This forces a count or recount to fill in or update these values.
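
The Invert Subset and Count Subsets actions can be pictured with the following Java sketch, which assumes that a subset is held as one flag per table row in a java.util.BitSet. That representation is an assumption made for the illustration, not a statement about TOPCAT's internals, and the class and method names are invented.

```java
import java.util.BitSet;

// Illustrative only: SubsetSketch is a hypothetical name, not a TOPCAT class.
class SubsetSketch {

    // Complement of a subset over a table with nRows rows: every row which
    // the original excludes is included, and vice versa.
    static BitSet invert(BitSet subset, int nRows) {
        BitSet inverse = (BitSet) subset.clone();
        inverse.flip(0, nRows);
        return inverse;
    }

    // Number of rows in the subset, as shown in the Size column.
    static int count(BitSet subset) {
        return subset.cardinality();
    }
}
```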

A.3.5 Statistics Window

Statistics Window

The Statistics Window shows statistics for the values in each of the table's columns. You can display it using the Column Statistics button when the chosen table is selected in the Control Window's Table List.

The calculated values are displayed in a JTable widget with a row for each column in the main table, and a column for each of a number of statistical quantities calculated on some or all of the values in the data table column corresponding to that grid row. The following columns are shown by default:

Name
The name of the column in the main table represented by this grid row.
Mean
The mean value of the good cells. For boolean columns, this is the proportion of good cells which are True.
S.D.
The standard deviation of the good cells.
Minimum
The minimum value. For numeric columns the meaning of this is quite obvious. For other columns, if an ordering can be reasonably defined on them, the 'smallest' value may be shown. For instance string values will show the entry which would be first alphabetically.
Maximum
As minimum, but shows the largest values.
Good cells
The number of non-blank cells.
Several additional items of statistical information are also calculated, but the columns displaying these are hidden by default to avoid clutter. You can reveal these by using the Display menu:
Index
The index of the column in the table, i.e. the order in which it is displayed.
$ID
The unique identifier label for the column in the main table.
Sum
The sum of all the values in the column. For boolean columns this is a count of the number of True values in the column.
Variance
The variance of the good cells.
Row of min.
The index of the row in the main table at which the minimum value occurred.
Row of max.
The index of the row in the main table at which the maximum value occurred.
Bad cells
The number of blank cells; the sum of this value and the Good cells value will be the same for each column.
Cardinality
If the column contains only a small number of distinct values, that number (the column's cardinality) is shown here. If the number of distinct values is large (currently >50) or is a large proportion of the number of non-bad values (currently >75%) then no value is shown.
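
The quantities listed above can be sketched for a single numeric column in Java as follows. This is an illustration under stated assumptions rather than TOPCAT's own code: null and NaN cells are both treated as blank, the variance is the population form (dividing by the number of good cells), and all names are invented. The cardinality thresholds are the ones quoted above.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: ColumnStatsSketch is a hypothetical name.
class ColumnStatsSketch {
    long good;                      // number of non-blank cells
    long bad;                       // number of blank cells
    double sum;
    double mean = Double.NaN;
    double variance = Double.NaN;   // population variance (an assumption)
    double sd = Double.NaN;
    double min = Double.NaN;
    double max = Double.NaN;
    int cardinality = -1;           // -1 stands for "no value shown"

    ColumnStatsSketch(Double[] cells) {
        double sumSquares = 0;
        Set<Double> distinct = new HashSet<>();
        for (Double cell : cells) {
            // Assumption: null and NaN both count as blank cells.
            if (cell == null || cell.isNaN()) {
                bad++;
                continue;
            }
            double v = cell;
            good++;
            sum += v;
            sumSquares += v * v;
            if (good == 1 || v < min) min = v;
            if (good == 1 || v > max) max = v;
            distinct.add(cell);
        }
        if (good > 0) {
            mean = sum / good;
            variance = Math.max(0.0, sumSquares / good - mean * mean);
            sd = Math.sqrt(variance);
            // Shown only for at most 50 distinct values which make up
            // at most 75% of the good cells.
            if (distinct.size() <= 50 && distinct.size() <= 0.75 * good) {
                cardinality = distinct.size();
            }
        }
    }
}
```

Whatever the details, the good and bad counts always add up to the number of rows considered, which is why their sum is the same for every column.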

The quantities displayed in this window are not necessarily those for the entire table; they are those for a particular Row Subset. At the bottom of the window is the Subset For Calculations selector, which allows you to choose which subset you want the calculations to be done for. By clicking on this you can calculate the statistics for different subsets. When the window is first opened, or when it is invoked from a menu or the toolbar in the Control Window, the subset will correspond to the current row subset.

The toolbar contains the following extra button:

Recalculate
Once statistics have been calculated for a given subset they are cached and not normally recalculated again. Use this button if you want to force a recalculation because the data may have changed.

For a large table the calculations may take a short while. While they are being performed you can interact with the window as normal, but a progress bar is shown at the bottom of the window. If you initiate a new calculation (by pushing the Recalculate button or selecting a new subset) or close the window during a calculation, the superseded calculation will be stopped.

A.3.6 Plot Window

Plot Window

The Plot Window allows you to plot the values in two table columns against each other. You can display it using the Plot button when the chosen table is selected in the Control Window's Table List.

On the plotting surface a marker is plotted for each row in the selected Row Subsets at a position determined by the values in the table columns selected to provide the X and Y values. A marker will only be plotted if both the X and Y values are not blank. If more than one subset is being plotted, they will be drawn using different markers. A key on the right hand side indicates the marker being used for each subset. The marker types can be changed using the Marker Types menu.

You can zoom in and out of the plot by dragging with the left mouse button held down: drag down and to the right to zoom in, or up and to the left to zoom out. This takes a little practice but is easy to use after a couple of goes. If you get lost you can push the Rescale button to return the scaling to normal.

Below the plot there are two sets of controls for selecting the table column which will provide the X and Y axis values. Each one consists of the following parts:

Column selector
A selection box from which column names in the main table can be selected. Only columns which can be plotted from (i.e. scalar numeric ones) will be displayed in this selector.
Log checkbox
This checkbox can be clicked to toggle whether the axis in question is to be plotted logarithmically or not. If it is logarithmic, any negative values are simply ignored (not plotted).
Flip checkbox
This checkbox can be clicked to toggle whether the axis in question is to be plotted reversed. Normally (unticked) X axis increases left to right and Y axis increases bottom to top. When ticked, X axis increases right to left, and Y axis increases top to bottom.
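
The combined effect of the Log and Flip checkboxes on where a value falls along an axis can be sketched in Java as follows. The names are invented for the illustration, and the sketch makes one assumption beyond the description above: on a logarithmic axis it skips zero as well as negative values, since neither has a logarithm.

```java
// Illustrative only: AxisSketch is a hypothetical name, not a TOPCAT class.
class AxisSketch {

    // Position of a value along an axis covering the range lo..hi, as a
    // fraction between 0 (left or bottom) and 1 (right or top).
    // Returns NaN for a value which cannot be plotted.
    static double fraction(double value, double lo, double hi,
                           boolean log, boolean flip) {
        if (log) {
            // Non-positive values are ignored on a logarithmic axis.
            if (!(value > 0)) {
                return Double.NaN;
            }
            value = Math.log10(value);
            lo = Math.log10(lo);
            hi = Math.log10(hi);
        }
        double f = (value - lo) / (hi - lo);
        // A flipped axis runs in the opposite direction.
        return flip ? 1.0 - f : f;
    }
}
```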

To the right is a set of checkboxes headed Row Subsets. Click on these to choose which of the table's defined Row Subsets should be plotted on this graph. Different subsets are plotted using different markers, so you can see where different groups of results lie in relation to each other. You can alternatively use the Subsets To Plot item on the Subsets menu. The subsets are plotted in order of which was most recently selected. This makes a difference on a crowded plot or where some points are members of multiple subsets, since the most recently plotted symbol will appear on top. If points from one subset are being hidden behind those from another, you can deselect and reselect that subset and they'll be shown on top.

The following extra buttons are available on the toolbar:

Export as EPS
Pops up a dialogue which will print the current plot as an EPS file. In general this is a faithful and high quality rendering of what is displayed in the plot window. However, if plotting is being done using the transparent markers, it won't come out right since transparency cannot be represented in PostScript; the markers will be rendered as if they were opaque. Currently, if there are many points being plotted, this can result in a rather large output file.
Export as GIF
Pops up a dialogue which will output the current plot to a GIF file. The output file is just the same as the plotted image that you see. Resize the plotting window before the export to control the size of the output GIF.
Rescale
Rescales the axes of the current plot so that it contains all the data points in the currently selected subsets. By default the plot will be scaled like this, but it may have changed because of changes in the subset selection or from zooming in or out.
Replot
Redraws the current plot. It is usually not necessary to use this button, since if you change any of the plot characteristics with the controls in this window the plot will be redrawn automatically. However if you have changed the data, e.g. by editing cells in the Data Window, the plot is not automatically redrawn (since this is potentially an expensive operation and you may not require it). Clicking this button redraws the plot taking account of any changes to the table data.
Grid
Toggles whether a grid is drawn over the plotting surface or not.
Draw Subset Region
Allows you to draw a region on the screen defining a new Row Subset. When you have finished drawing it, click this button again to indicate you're done. See Appendix A.3.6.1 for more details.
Subset From Visible
Defines a new Row Subset consisting of only the points which are currently visible on the plotting surface. See Appendix A.3.6.1 for more details.

The Marker Types menu allows you to select a set of markers which will be used for plotting. Some of these sets are marked "Transparent" - for these, instead of pixels on the plot blocking out ones already plotted, the more markers that are plotted at a given screen position, the darker in colour it will appear. This can be useful if you have very many points to plot, since you can see by the colour of pixels on the plot how many points are there in crowded regions. Unfortunately transparent points are not rendered properly when exported to PostScript files (they come out opaque), but they still work when exported to GIF format. The marker type set used initially depends on how many rows there are in the table (large dots for few rows, small ones for many).

The Regression menu provides facilities for calculating and plotting linear regression lines for some or all of the subsets on display. The following options appear on the menu:

Plot Regression For Subsets...
Presents you with a checkbox menu which allows you to select which subsets regression lines will be displayed for. Each subset whose box you select will have a line plotted on the graph representing its least-squares fit regression line in the same colour as the points for that subset.
Display Regression Coefficients
Displays a window giving the gradient, intercept and product moment correlation coefficient for each regression line which has been plotted.
Note the regression lines plotted are those calculated from all the visible points in the subset in question - any points off the edge of the graph are disregarded. Thus, zooming in and out will change the correlation coefficients and line geometry. The linear correlation functionality is experimental in this version of the program, and will be improved in future.
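
The following Java sketch shows a standard least-squares fit of Y on X restricted to the points inside the visible range, which is why zooming changes the result. It is an illustration using the textbook formulae and invented names; the program's own calculation may differ in detail.

```java
// Illustrative only: RegressionSketch is a hypothetical name.
class RegressionSketch {
    final int nPoints;         // number of visible points used in the fit
    final double gradient;
    final double intercept;
    final double correlation;  // product moment correlation coefficient

    RegressionSketch(double[] x, double[] y,
                     double xlo, double xhi, double ylo, double yhi) {
        int n = 0;
        double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
        for (int i = 0; i < x.length; i++) {
            double xi = x[i];
            double yi = y[i];
            // Points off the edge of the graph are disregarded; blank (NaN)
            // values fail these comparisons and are skipped as well.
            if (xi >= xlo && xi <= xhi && yi >= ylo && yi <= yhi) {
                n++;
                sx += xi;
                sy += yi;
                sxx += xi * xi;
                syy += yi * yi;
                sxy += xi * yi;
            }
        }
        double vx = n * sxx - sx * sx;
        double vy = n * syy - sy * sy;
        double cxy = n * sxy - sx * sy;
        nPoints = n;
        // With fewer than two distinct X values these come out as NaN.
        gradient = cxy / vx;
        intercept = (sy - gradient * sx) / n;
        correlation = cxy / Math.sqrt(vx * vy);
    }
}
```

For example, four points lying on the line y = 2x + 1 give a gradient of 2 and an intercept of 1, but zooming out far enough to bring an outlying fifth point into view changes both.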

A.3.6.1 Defining Subsets From Plots

When columns are plotted against each other in the Plot Window, it becomes easy to see groupings of the data which may not be otherwise apparent; a cluster of (X,Y) points representing a group of rows may correspond to a physically important grouping of objects which you would like to treat separately elsewhere in the program, for instance by calculating statistics on just these rows, writing them out to a new table, or plotting them in a different colour on graphs with different coordinates. This is easily accomplished by creating a new Row Subset containing the grouped points, and the Plot Window gives you two ways to do this.

The simplest way is to zoom the plot so that only the points you want to identify are visible (by dragging the mouse down-and-right to zoom in or up-and-left to zoom out) and then hit the New Subset From Visible toolbar button. This defines a subset consisting of all the points that are currently visible. This has the limitation that only a rectangular grouping of points can be selected.

A much more flexible way is to draw a region or regions on the plot which identify the points you are interested in. To do this, hit the Draw Subset Region toolbar button. Having done this, you can drag the mouse around on the plot (keep the left mouse button down while you move) to encircle the points that you're interested in. As you do so, a translucent grey blob will be left behind - anything inside the blob will end up in the subset. You can draw one or many blobs, which may be overlapping or not. If you make a mistake while drawing a sequence of blobs, you can click the right mouse button, and the most recently added blob will disappear. When you're in this region-drawing mode, you can't zoom or resize the window or change the characteristics of the plot, and the Draw Subset Region button appears with a tick over it to remind you you're in it. Here's what the plot looks like while you're drawing:

Region-Drawing Mode

When you're happy with the region you've defined, click the toolbar button again.

In either case, when you have indicated that you want to define a new row subset, a dialogue box will pop up to ask you its name. As described in Section 2.1.1, it's a good idea to use a name which you haven't used before, and which is just composed of letters, numbers and underscores. When you enter a name and hit the OK button, the new subset will be created and the points in it will be shown straight away on the plot using a new symbol. As usual, you can toggle whether the points in this subset are displayed using the Row Subsets box at the bottom of the Plot Window.
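
The two ways of picking out points described in this section can be sketched in Java as follows: one method selects the points inside the visible rectangle, and the other selects the points inside at least one drawn region. This is an illustration only. It assumes each region is a closed outline held as a java.awt.geom.Path2D and that a subset is one flag per row; all names are invented.

```java
import java.awt.geom.Path2D;
import java.util.BitSet;
import java.util.List;

// Illustrative only: PlotSubsetSketch is a hypothetical name.
class PlotSubsetSketch {

    // Builds a closed region from the vertices of its outline.
    static Path2D outline(double[][] vertices) {
        Path2D.Double path = new Path2D.Double();
        path.moveTo(vertices[0][0], vertices[0][1]);
        for (int i = 1; i < vertices.length; i++) {
            path.lineTo(vertices[i][0], vertices[i][1]);
        }
        path.closePath();
        return path;
    }

    // Rows whose point lies within the visible rectangle of the plot.
    static BitSet fromVisible(double[] x, double[] y,
                              double xlo, double xhi, double ylo, double yhi) {
        BitSet subset = new BitSet(x.length);
        for (int i = 0; i < x.length; i++) {
            // Blank (NaN) coordinates fail every comparison and are excluded.
            if (x[i] >= xlo && x[i] <= xhi && y[i] >= ylo && y[i] <= yhi) {
                subset.set(i);
            }
        }
        return subset;
    }

    // Rows whose point lies inside at least one of the drawn regions,
    // so overlapping and separate regions are both handled.
    static BitSet fromRegions(double[] x, double[] y, List<Path2D> regions) {
        BitSet subset = new BitSet(x.length);
        for (int i = 0; i < x.length; i++) {
            for (Path2D region : regions) {
                if (region.contains(x[i], y[i])) {
                    subset.set(i);
                    break;
                }
            }
        }
        return subset;
    }
}
```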


A.4 Load Window

Load Window

The Load Window is used for loading tables from an external location (e.g. disk or URL) into TOPCAT. It is obtained using the Load Table button in the Control Window toolbar or File menu.

This dialogue allows you to specify a new table to open in several different ways, described below. If you successfully load a table using any of these options, a new entry will be added into the Table List in the Control Window, which you can then use in the usual ways. If you choose a location which can't be turned into a table (for instance because the file doesn't exist), a window will pop up telling you what went wrong. If you get an OutOfMemoryError while loading a table, you will have to run TOPCAT with more memory, as described in Section 7.2.2 or use the -disk flag described in Section 7.1.

In the simplest case, you can type a name into the Location field and hit return or the OK button. This location can be a filename or a URL, possibly followed by a '#' character and a 'fragment identifier' to indicate where in the file or URL the table is located; the details of what such fragment identifiers mean can be found in the relevant subsection within Section 3.1. You should select the relevant table format from the Format selector box - you can leave it on (auto) for loading FITS tables or VOTables, but for other formats such as ASCII or CSV you must select the right one explicitly (again, see Section 3.1 for details).
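
As an illustration of the location syntax, the following Java sketch separates a location into the file or URL part and the optional fragment identifier. The names are invented, and splitting at the first '#' is an assumption made for the example; what the fragment means depends on the table format, as described in Section 3.1.

```java
// Illustrative only: LocationSketch is a hypothetical name.
class LocationSketch {

    // Splits "base#fragment" into {base, fragment}.
    // The fragment is null if the location contains no '#' character.
    static String[] split(String location) {
        int hash = location.indexOf('#');
        if (hash < 0) {
            return new String[] {location, null};
        }
        return new String[] {
            location.substring(0, hash),
            location.substring(hash + 1),
        };
    }
}
```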

There are many other ways of loading tables however, described in the following subsections. The Filestore Browser button is always visible below the location field. Depending on startup options, there may be other buttons here. In any case, you can look in the DataSources menu to see other table load dialogues. Exactly which ones are available will depend on your setup (some may be absent or greyed out, and additional ones may be available). The following subsections describe some of the options which may be available.

A.4.1 Filestore Browser

Filestore Browser window

By clicking the Filestore Browser button in the Load Window, you can obtain a file browser which will display the files in a given directory. The way this window works is almost certainly familiar to you from other applications.

Unlike a standard file browser however, it can also browse files in remote filestores: currently supported are MySpace and SRB. MySpace is a distributed storage system developed for use with the Virtual Observatory by the AstroGrid project, and SRB (Storage Resource Broker) is a similar general purpose system developed at SDSC. To make use of these facilities, select the relevant entry from the selector box at the top of the window as illustrated above; this will show you a Log In button which prompts you for username, password etc, and you will then be able to browse the remote filestore as if it were local. The same button can be used to log out when you are finished, but the session will be logged out automatically when TOPCAT ends in any case. Access to remote filesystems is dependent on certain optional components of TOPCAT, and it may not be available if you have the topcat-lite configuration.

The browser initially displays the current directory, but this can be changed by typing a new directory into the File Name field, or moving up the directory hierarchy using the selector box at the top, or navigating the file system by clicking the up-directory button or double-clicking on displayed directories.

All files are shown, and there is no indication of which ones represent tables and which do not. To open one of the displayed files as a table, double-click on it or select it by clicking once and click the Open Table button. The Table Format selector must be set correctly: the "(auto)" setting will automatically detect the format of VOTable or FITS tables, otherwise you will need to select the option describing the format of the file you are attempting to load (see Section 3.1). If you pick a file which cannot be converted into a table an error window will pop up.

Because this browser only works at the file level, there is a limit to what tables it can access. For instance if you select a FITS file, the table opened will correspond to the first TABLE or BINTABLE HDU within it. For a more table-aware view of the file system, use the Hierarchy Browser instead.

A.4.2 Hierarchy Browser

File load Hierarchy Browser window

By selecting the Hierarchy Browser option from the Load Window's DataSources menu, you can obtain a browser which presents a table-aware hierarchical view of the file system. (Note that a freestanding version of this panel with additional functionality is available in the separate Treeview application).

This browser resembles the Filestore Browser in some ways, but with important differences:

The main part of the window shows a "tree" representation of the hierarchy, initially rooted at the current directory. Each line displayed represents a "node" which may be a file or some other type of item (for instance an HDU in a FITS file or an entry in a tar archive). The line contains a little icon which indicates what kind of node it is and a short text string which gives its name and maybe some description. Nodes which represent tables are indicated by a distinctive icon. For nodes which have some internal structure there is also a "handle" which indicates whether they are collapsed or expanded. You can examine remote filespaces (MySpace, SRB) as well as local ones in the same way as with the Filestore Browser.

If you select a node by clicking on it, it will be highlighted and some additional description will appear in the panel below the hierarchy display. The text is in bold if the node in question can be opened as a table, and non-bold if it is some non-table item.

Note: an important restriction of this browser is that it will only pick up tables which can be identified automatically - this includes FITS and VOTable files, but does not include text-based formats such as ASCII and Comma-Separated Values. If you want to load one of the latter types of table, you will need to use one of the other load methods and specify table format explicitly.

You can see how this browser works on an example directory of tables as described in Appendix A.4.5.

Note that this window requires certain optional components of the TOPCAT installation, and will not be available if you have the topcat-lite configuration.

A.4.2.1 Navigation

Navigation is a bit different from navigation in the Filestore Browser window. To expand a node and see its contents, click on its handle (clicking on the handle when it is expanded will collapse it again). When you have identified the table you want to open, highlight it by clicking on it, and then click the Open Table button at the bottom.

To move to a different directory, i.e. to change the root of the tree which is displayed, use one of the buttons above the tree display:

Selector box
Allows you to move straight to any directory higher up than the current one.
Up
Moves to the parent of the current directory.
Down
Moves to the currently selected (highlighted) node.
Home
Moves to the user's home directory.
Alternatively, you can type in a new directory in the Go to field at the bottom of the window.

(In fact the above navigation options are not restricted to changing the root to a new directory, they can move to any node in the tree, for instance a level in a Tar archive.)

A.4.2.2 Table Searches

There are two more buttons in the browser, Search Selected and Search Tree. These do a recursive search for tables in all the nodes starting at the currently selected one or the current root respectively. What this means is that the program will investigate the whole hierarchy looking for any items which can be used as tables. If it finds any it will open up the tree so that they are visible (note that this doesn't mean that the only nodes revealed will be tables, ancestors and siblings will be revealed too). This can be useful if you believe there are a few tables buried somewhere in a deep directory structure or Tar archive, but you're not sure where. Note that this may be time-consuming - a busy cursor is displayed while the search is going on. Changing the root of the tree will interrupt the search.

A.4.3 SQL Query

SQL Query Dialogue

If you want to read a table from an SQL database, you can use a specialised dialogue to specify the SQL query by selecting the SQL Query option from the Load Window's DataSources menu.

This provides you with a list of fields to fill in which make up the query, as follows:

Protocol
The name of the appropriate JDBC sub-protocol. This is defined by the JDBC driver that you are using, and is for instance "mysql" for MySQL's Connector/J driver or "postgresql" for PostgreSQL's JDBC driver.
Host
The hostname of the machine on which the database resides (may be "localhost" if the database is local).
Database name
The name of the database.
SQL Query
The text of the query which will define the resulting table. If you want to look at a table named XXX as it exists in the database, you can write something like "SELECT * from XXX". In principle any SQL query on the database can be used here, but the details of what SQL syntax is permitted will be defined by the JDBC driver you are using.
User name
The username under which you wish to access the database. This is not strictly necessary if there is no access control for the database in question.
Password
The password for the given username. Again, whether this is necessary depends on the access policy of the database.

There are a number of criteria which must be satisfied for SQL access to work within TOPCAT (installation of appropriate drivers and so on) - see Section 7.3. If you don't take these steps, this dialogue may be inaccessible.
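
To show how the fields above relate to a JDBC connection, the following Java sketch assembles a URL of the form conventionally used by the MySQL and PostgreSQL drivers and runs a query with the standard java.sql classes. It is a sketch under assumptions, not a description of what the dialogue does internally: the class and method names are invented, and countRows needs a suitable JDBC driver on the classpath and a reachable database.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

// Illustrative only: SqlQuerySketch is a hypothetical name.
class SqlQuerySketch {

    // Builds a URL such as "jdbc:mysql://localhost/astro" from the
    // Protocol, Host and Database name fields.
    static String jdbcUrl(String protocol, String host, String database) {
        return "jdbc:" + protocol + "://" + host + "/" + database;
    }

    // Runs the query, prints the column names of the result and
    // returns the number of rows it contains.
    static int countRows(String url, String user, String password, String query)
            throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(query)) {
            ResultSetMetaData meta = rs.getMetaData();
            for (int i = 1; i <= meta.getColumnCount(); i++) {
                System.out.println(meta.getColumnName(i));
            }
            int nRows = 0;
            while (rs.next()) {
                nRows++;
            }
            return nRows;
        }
    }
}
```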

A.4.4 Cone Search

Cone search table import dialogue

By selecting the Cone Search option from the Load Window's DataSources menu, you can obtain a dialogue which allows you to query one of a number of external web services for a catalogue of objects known in a given region of the sky.

When first displayed, this dialogue window will ask an external services registry for all the cone search services on the net which have advertised their existence. When it has got the result, you will see a list of their names and titles in a table. For more information about each one, use the Columns menu to select what information, such as publisher, reference URL etc is displayed in the table. You can scroll up and down this table and select the one which you want to query by clicking on it.

Having selected one of the cone search services from the table, you need to specify the sky region in which you are interested. If you enter the name of an astronomical object into the Object Name field and hit the Resolve button, the coordinates will be entered into the RA and Dec fields below. Alternatively you can type the coordinates in directly, choosing either degrees or sexagesimal coordinates using the unit selector boxes. Enter the search radius too.

Having done this, hit the OK button. This will send the query to the service you selected and, if successful, load into TOPCAT a table containing all the objects in the region of the sky you have specified. The exact format of the returned table will depend on the service you have selected, but it will contain at least columns representing Right Ascension and Declination.

Note that this window requires certain optional components of the TOPCAT installation, and will not be available if you have the topcat-lite configuration.
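To make the coordinate entry described above more concrete, here is a minimal sketch of converting sexagesimal input to degrees and of forming a query from a position and radius. It assumes the service follows the usual cone search convention of RA, DEC and SR parameters in decimal degrees; the manual does not spell out the protocol, so treat those parameter names, the helper names and the example.org address as assumptions rather than a description of what TOPCAT sends.

```python
from urllib.parse import urlencode


def sexagesimal_to_degrees(text, hours=False):
    """Convert 'dd:mm:ss' (or 'hh:mm:ss' if hours=True) to decimal degrees."""
    text = text.strip()
    # Take the sign from the string so that '-00:30:00' stays negative.
    sign = -1.0 if text.startswith("-") else 1.0
    parts = [float(p) for p in text.lstrip("+-").split(":")]
    while len(parts) < 3:
        parts.append(0.0)
    value = parts[0] + parts[1] / 60.0 + parts[2] / 3600.0
    if hours:
        value *= 15.0  # 24 hours of Right Ascension span 360 degrees
    return sign * value


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Append RA, DEC and SR (all in decimal degrees) to a service base URL."""
    params = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    if base_url.endswith(("?", "&")):
        separator = ""
    elif "?" in base_url:
        separator = "&"
    else:
        separator = "?"
    return base_url + separator + params


ra = sexagesimal_to_degrees("12:30:00", hours=True)
dec = sexagesimal_to_degrees("-00:30:00")
url = cone_search_url("http://example.org/cone", ra, dec, 0.1)
```

Choosing the wrong unit selector corresponds to calling the conversion with the wrong hours flag, which shifts the search position by a factor of 15 in Right Ascension.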

A.4.5 Example Tables

Provided with TOPCAT are some example tables, which you can access in a number of ways. The simplest thing is to start up TOPCAT with the "-demo" flag on the command line, which will cause the program to start up with a few demonstration tables already loaded in.

You can also load examples from the Examples menu in the Load Window. This contains the following options:

Load Example Table
Loads in a single example table.
Browse Demo Data
Pops up a Hierarchy Browser looking at a hierarchy of tables in different formats. This option is designed to show some of the organisational complexity which TOPCAT can handle when browsing tables.

Note these examples are a bit of a mixed bag, and are not all that exemplary in nature. They are just present to allow you to play around with some of TOPCAT's features if you don't have any real data to hand.


A.5 Save Window

Save Window

The Save Window is used to write tables out, and it is accessed using the Save Table button in the Control Window's toolbar or File menu. Any table in the Control Window's table list can be written at any time; what is written is the Apparent Table corresponding to the currently selected table, which takes into account any modifications you have made to its data or appearance during this session. The current Row Subset and Row Order are displayed in this window as a reminder of what you're about to save; if you modify the values in these selectors you will be modifying the Apparent Table in the usual way.

Any Row Subsets which have been defined on the table in the current session will not be saved themselves, but you can save information about subset membership by creating new boolean columns based on subsets using the "To Column" button from the Subsets Window.

You can use the Table Output Format selector box to pick the format in which the table will be written from one of the supported output formats. There is no default format, and it won't automatically save to the same format it was loaded from, but if you leave it on "(auto)" it will try to guess the format based on the filename given; for instance if you specify the name "out.fits", a FITS binary table will be written.

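You can specify the location of the output table in any of the ways described in the following sections: by entering it directly (Appendix A.5.1), by using the Filestore Browser (Appendix A.5.2), or by using the SQL Output Dialogue (Appendix A.5.3).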

In some cases, saving the table to the same name as it was loaded from can cause problems (e.g. an application crash which loses the data unrecoverably). In other cases, it's perfectly OK. The main case in which it's problematic is when editing an uncompressed FITS binary table on disk. TOPCAT will attempt to warn you if it thinks you are doing something which could lead to trouble; ignore this at your own risk.

There is no option to compress files on output (though you can of course compress them yourself once they have been written).

If the table is large, a progress bar indicating how near the save is to completion will appear. It is not advisable to edit the table during a save operation.

In some cases, when saving a table to a format other than the one from which it was loaded, or if some new kinds of metadata have been added, it may not be possible to express all the data and metadata from the table in the new format. For instance a WDC table can contain data which represent epoch (date), and this cannot be stored in a FITS table. In this case the table may be written with such columns missing. Some message to this effect may be output in this case.

A.5.1 Enter Location

You can specify where to save a table by typing its location directly into the Output Location field of the Save Table window. This will usually be the name of a new file to write to, but could in principle be a URL or a SQL specifier.

A.5.2 Filestore Browser

Filestore Browser for table saving

By clicking the Browse Filestore button in the Save Table window, you can obtain a browser which will display the files in a given directory.

The browser initially displays the current directory, but this can be changed by typing a new directory into the File Name field, or moving up the directory hierarchy using the selector box at the top, or navigating the file system by clicking the up-directory button or double-clicking on displayed directories.

The browser can display files in remote filestores such as on MySpace or SRB servers; see the section on the load filestore browser (Appendix A.4.1) for details.

To save to an existing file, select the file name and click the OK button at the bottom; this will overwrite that file. To save to a new file, type it into the File Name field; this will save the table under that name into the directory which is displayed. You can (re)set the format in which the file will be written using the Output Format selector box on the right (see Section 3.2 for discussion of output formats).

A.5.3 SQL Output Dialogue

SQL table writing dialogue

If you want to write a table to an SQL database, you can use a specialised dialogue to specify the table destination by clicking the SQL Table button in the Save Table window.

This provides you with a list of fields to fill in which define the new table to write, as follows:

Protocol
The name of the appropriate JDBC sub-protocol. This is defined by the JDBC driver that you are using, and is for instance "mysql" for MySQL's Connector/J driver or "postgresql" for PostgreSQL's JDBC driver.
Host
The hostname of the machine on which the database resides (may be "localhost" if the database is local).
Database name
The name of the database.
New table name
The name of a new table to write into the given database. Subject to user privileges, this will overwrite any existing table in the database which has the same name, so should be used with care.
User name
The username under which you wish to access the database. This is not strictly necessary if there is no access control for the database in question.
Password
The password for the given username. Again, whether this is necessary depends on the access policy of the database.

There are a number of criteria which must be satisfied for SQL access to work within TOPCAT (installation of appropriate drivers and so on) - see the section on JDBC configuration. If you don't take these steps, this dialogue may be inaccessible.


A.6 Concatenation Window

Concatenation Window

The Concatenation Window allows you to join two tables together top-to-bottom. It can be obtained using the Concatenate Tables button in the Control Window toolbar or Joins menu.

When two tables are concatenated, all the rows of the first ("base") table are followed by all the rows of the second ("appended") table. The result is a new table which has a number of rows equal to the sum of the two it has been made from. The columns in the resulting table are the same as those of the base table. To perform the concatenation, you have to specify which columns from the appended table correspond to which ones in the base table. Of course, this sort of operation only makes sense if at least some of the columns in both tables have the same meaning. This process is discussed in more detail in Section 4.1.

The concatenation window allows you to select the base and appended tables, and for each column in the base table to specify which column in the appended table corresponds to it. You may select a blank for this, in which case the column in question will have all null entries in the resulting table. In some cases these column selectors may have a value filled in automatically if the program thinks it can guess appropriate ones, but you should ensure that it has guessed correctly in this case. Only suitable columns are available for choosing from these column selectors; in most cases this means numeric ones.

When you have filled in the fields to your satisfaction, hit the Concatenate button at the bottom of the window, and a new table will be created and added to the table list in the Control Window (a popup window will inform you this has happened).

The result is created from the Apparent versions of the base and appended tables, so that any row subsets, hidden columns, or sorts currently in force will be reflected in the output.
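The column correspondence described above can be sketched as follows. This is only an illustration of the idea, not TOPCAT's implementation: the function name, the list-of-lists table representation and the example column names are all hypothetical, and None stands in for a blank selection and for a null cell.

```python
def concatenate(base_cols, base_rows, appended_cols, appended_rows, mapping):
    """Return the base rows followed by the remapped rows of the appended table.

    mapping gives, for each base column name, the corresponding appended
    column name, or None where a blank was selected.
    """
    index = {name: i for i, name in enumerate(appended_cols)}
    result = [list(row) for row in base_rows]
    for row in appended_rows:
        new_row = []
        for name in base_cols:
            source = mapping.get(name)
            # A blank selection yields a null entry in the output column.
            new_row.append(None if source is None else row[index[source]])
        result.append(new_row)
    return result


base_cols = ["RA", "DEC", "Vmag"]
base_rows = [[10.0, -5.0, 13.8]]
appended_cols = ["ra2000", "dec2000", "Bmag"]
appended_rows = [[11.0, -6.0, 14.6]]
mapping = {"RA": "ra2000", "DEC": "dec2000", "Vmag": None}
combined = concatenate(base_cols, base_rows, appended_cols, appended_rows, mapping)
```

Here Bmag has no counterpart in the base table and is dropped, while Vmag is left null for the appended row, which mirrors the rule that the output columns are those of the base table.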


A.7 Pair Match Window

Pair Match Window

The Pair Match Window allows you to join two tables together side-by-side, aligning rows by matching values in some of their columns between the tables. It can be obtained using the Pair Match button in the Control Window toolbar or Joins menu.

In a typical scenario you might have two tables each representing a catalogue of astronomical objects, and you want a third table with one row for each object which has an entry in both of the original tables. An object is defined as being the same one in both tables if the co-ordinates in both rows are "similar", for instance if the positions indicated by the RA and Dec columns differ by no more than a specified angle on the sky. Matching rows to produce the join requires you to specify the criteria for rows in both tables to refer to the same object and what to do when one is found - the options are discussed in more detail in Section 4.2.

The result is created from the Apparent versions of the tables being joined, so that any row subsets, hidden columns, or sorts currently in force will be reflected in the output. Progress information on the match, which may take a little while, is provided in the logging window and by a progress bar at the bottom of the window. When it is completed, you will be informed by a popup window which indicates that a new table has been created. This table will be added to the list in the Control Window and can be examined, manipulated and saved like any other. In some cases, some additional columns will be added to the output table which give you more information about how the match has progressed (see Appendix A.7.3).

The Match Window provides a set of controls which allow you to choose how the match is done and what the results will look like. It consists of these main parts:

Match Criteria box
Allows you to define what counts as a match between two rows.
Column Selection boxes
Allows you to select which tables are to be joined and which columns in them supply the matching coordinates.
Output Rows selector
Allows selection of which rows are to be included in the output table (for instance whether only those rows matching in both tables should be output or not).
Log window
Reports on progress as the match is taking place. The progress bar at the bottom of the window also provides an indication of how far through each stage processing has got.
Control buttons
The Go button starts the search when you are happy with the selections that you have made, and the Stop button interrupts it midway if you decide you no longer want the results (closing the Match Window also interrupts the calculation).

The following sections describe some of these components in more detail.

A.7.1 Match Criteria

The match criteria box allows you to specify what counts as a match between two rows. The selection you make in this box will determine which columns you have to fill in for the table(s) being matched in the rest of the window. In most cases what you are selecting here is the coordinate space in which rows will be compared against each other, and a numerical value or values to determine how close two rows have to be in terms of a metric on that space to count as a match.

The following match types are offered:

Sky
Comparison of positions on the celestial sphere. In this case you will need to specify columns giving Right Ascension and Declination for each table participating in the match. The Max Error value you must fill in is the maximum separation of matched points around a great circle.
Sky with Errors
The matching is like that for the Sky option above, but an error radius (positional uncertainty) can be given for each row in the input tables, rather than just a single value for the whole match. You need to specify a single Max Error value, which gives the global maximum separation applying to all matches, and for each of the input tables, along with the Right Ascension and Declination columns, you also specify an Error column which gives the error radius corresponding to that position. Two rows are considered to match when the separation between the two RA,Dec positions is smaller than both the Max Error value and the sum of the two Error values for the corresponding rows. If either of the per-row Error values is blank, then any separation up to the Max Error is considered to match. According to these rules, you might decide to set the Max Error to an arbitrarily large number so that only the sum of per-row Errors will determine the actual match criteria. However, please don't do this, since the Max Error also functions as a tuning parameter for the matching algorithm, and ought to be reasonably close to the actual maximum acceptable separation - if necessary use the Statistics Window to determine the actual maximum uncertainty.
Sky 3D
Comparison of positions in the sky taking account of distance from the observer. In this case you will need to specify columns giving Right Ascension and Declination in angular units, as well as distance along the line of sight in arbitrary units for each table participating in the match. The Error value is a maximum separation in Cartesian space of matched points in the same units as the radial distance.
Exact Value
Requires exact matching of values. In this case you will need to specify the column containing the match key for each table participating in the match; this might typically be an object name or index number. Two rows count as matching if they have exactly the same entry in the specified field, except rows with a null value in that column, which don't match any other row.
N-dimensional Cartesian
Comparison of positions in an isotropic N-dimensional Cartesian space. In this case you will need to specify N columns giving coordinates for each table participating in the match. The Error value is the maximum spatial separation of matched points. Currently the highest dimensionality you can select is 3-d - does anyone want a higher number?
N-dimensional Cartesian (anisotropic)
Comparison of positions in an N-dimensional Cartesian space with an anisotropic metric. In this case you will need to specify N columns giving coordinates for each table participating in the match, and an error radius for each of these dimensions. Points P1 and P2 are considered to match if P2 falls within the ellipsoid defined by the error radii centered on P1. This kind of match will typically be used for non-'spatial' spaces, for instance (magnitude,redshift) space, in which the metrics in different dimensions are not related to each other. Currently the highest dimensionality you can select is 4-d - does anyone want a higher number?
Sky + X
Comparison of positions on the celestial sphere with an additional numeric constraint. This is a combination of the Sky and 1-d Cartesian matches above, so the columns you need to supply are RA, Dec and one extra, and the errors are angular separation on the sky and the error in the extra column. A match is registered if it matches in both of the constituent tests. You could use this for instance to match objects which are both close on the sky and have similar luminosities.
Sky + XY
Comparison of positions on the celestial sphere with an additional 2-d anisotropic Cartesian constraint. This is a combination of the Sky and 2-d Anisotropic Cartesian matches above, so the columns you need to supply are RA, Dec and two extra, and the errors are angular separation on the sky and the error radii corresponding to the extra columns. A match is registered if it matches in both of the constituent tests. You could use this for instance to match objects which are both close on the sky and have similar luminosities and redshifts.
HTM
Performs sky matching in just the same way as the Sky option above, but using a different algorithm (pixelisation of the celestial sphere is performed using the Hierarchical Triangular Mesh rather than the HEALPix scheme). The results in both cases should be identical, but HTM is much slower. Hence, this option is only useful for debugging. It may be withdrawn in future releases.
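The Sky and Sky with Errors rules in the list above can be written down compactly. The sketch below is illustrative only: it uses the haversine formula for the great-circle separation (a standard choice, not necessarily the one TOPCAT uses), the function names are hypothetical, all angles are assumed to be in degrees, and treating a separation exactly equal to the limit as a match is an assumption.

```python
import math


def sky_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two positions in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for small separations.
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


def sky_match(sep, max_error):
    """Sky criterion: the separation must not exceed Max Error."""
    return sep <= max_error


def sky_match_with_errors(sep, max_error, err1, err2):
    """Sky with Errors criterion; err1 and err2 are None where blank."""
    if sep > max_error:
        return False
    if err1 is None or err2 is None:
        return True  # a blank per-row error leaves only the Max Error limit
    return sep <= err1 + err2
```

Because Max Error is always tested, setting it far larger than any realistic per-row sum changes nothing about which pairs pass in this sketch; the manual's advice to keep it close to the true maximum concerns the performance of the real matching algorithm.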

Depending on the match type, the units of the error value(s) you enter may be significant. In this case, there will be a unit selector displayed alongside the entry box. You must choose units which are correct for the number you enter.
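For the anisotropic Cartesian match described above, the ellipsoid test amounts to scaling each coordinate difference by its own error radius, so each radius must be given in the same units as its coordinate. The following sketch shows that test under the usual definition of an axis-aligned ellipsoid; the function name and the (magnitude, redshift) values are made up for illustration.

```python
def anisotropic_match(p1, p2, radii):
    """True if p2 lies inside the ellipsoid with the given radii centred on p1."""
    if not (len(p1) == len(p2) == len(radii)):
        raise ValueError("points and radii must have the same dimensionality")
    # Sum of squared differences, each scaled by the radius for its dimension.
    scaled = sum(((b - a) / r) ** 2 for a, b, r in zip(p1, p2, radii))
    return scaled <= 1.0


# Hypothetical error radii: 0.5 in magnitude and 0.01 in redshift.
radii = (0.5, 0.01)
close = anisotropic_match((18.0, 0.100), (18.3, 0.105), radii)
far = anisotropic_match((18.0, 0.100), (18.45, 0.109), radii)
```

The second pair lies within each radius taken separately but still fails, because the test is an ellipsoid rather than a box.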

A.7.2 Column Selection Boxes

The column selection boxes allow you to select which of the columns in the input tables will provide the data (the coordinates which have to match). For each table you must select the names of the required columns; the ones you need to select will depend on the match criteria you have chosen.

For some columns, such as Right Ascension and Declination in sky matches, units are important for the columns you select. In this case, there will be a selector box for the units alongside the selector box for the column itself. You must ensure that the correct units have been selected, or the results of the match will be rubbish.

In some cases these column and/or unit selectors may have a value filled in automatically (if the program thinks it can guess appropriate ones) but you should ensure that it has guessed correctly in this case. Only suitable columns are available for choosing from these column selectors; in most cases this means numeric ones.

A.7.3 Output Rows Selector Box

When the match is complete a new table will be created which contains rows determined by the matches which have taken place. The Output Rows selector box allows you to choose on what basis the rows will be included in the output table as a function of the matches that were found.

In all cases each row will refer to only one matched (or possibly unmatched) "object", so that any non-blank columns in a given row come from only rows in the input tables which match according to the specified criteria. However, you have two (somewhat interlinked) choices to make about which rows are produced.

The Match Selection selector allows you to choose what happens when a given row in one table can be matched by more than one row in the other table. There are two choices:

Best Match Only
Only the best match is chosen, and the other correct but inferior matches are ignored. The best match is usually the "closest" - it is the one with the lowest match score, where the definition of this score is determined by the match criteria you have selected.
All Matches
Any pairs which meet the match criteria are retained in the output table. This means that you may have data from some of the same input rows appearing more than once in the output.

The Join Type selector allows you to choose what output rows result from a match in the input tables.

1 and 2
The output table contains only rows which have an entry from both of the input tables, so that every output row represents an actual matched pair.
All from 1
All of the matched rows are present in the output as for 1 and 2, but additionally the unmatched rows from the first table are present with the columns from the second table blank.
All from 2
As for All from 1 but the other way round.
1 or 2
Every row, matched and unmatched, from both of the input tables appears in the output. This is the union of rows from All from 1 and All from 2.
1 not 2
This presents all the rows in the first table which do not have matches in the second table. Consequently, it only contains columns from the first table, since all the entries from the second one would be blank in any case.
2 not 1
The same as 1 not 2 but the other way round.
1 xor 2
The "exclusive or" of the match - the output only contains rows from the first table which don't have matches in the second table and vice versa. It is the union of 1 not 2 and 2 not 1.

In most cases (all the above except for 1 not 2 and 2 not 1), the set of columns in the output table contains all the columns from the first table followed by all the columns from the second table. If this causes a clash of column names, offending columns will be renamed with a trailing "_1" or "_2". Depending on the details of the match however, some additional useful columns may be added:

Match Score
For rows that represent a match, a numeric value representing how good the match was will usually be present. This is typically a separation in real or notional space - for instance for a Sky match it is the distance between the two matched celestial positions in arcseconds along a great circle. It will always be greater than or equal to zero, and a smaller value represents a better match. The name and exact meaning of this column depends on the match criteria - examine its description in the Columns Window for details.
GroupSize, GroupID
If you choose the All Matches option and some of the rows match more than once, two columns named GroupID and GroupSize will be added. These allow you to identify which matches are multiple. In the case of rows which represent a unique match, they are blank. But for rows which represent a set of multiple matches, the GroupSize value tells you how many rows participate in this match, and the GroupID value is an integer which is the same for all the rows which participate in the same match. So if you do a sort on the GroupID value, you'll see all the rows in the first non-unique match group together, followed by all the rows in the second non-unique group... and after them all the unique matches.

Here is an example. If your input tables are these:

   X           Y           Vmag
   -           -           ----
   1134.822     599.247    13.8
    659.68     1046.874    17.2
    909.613     543.293     9.3

and

   X           Y           Bmag
   -           -           ----
    909.523     543.800    10.1
   1832.114     409.567    12.3
   1135.201     600.100    14.6
    702.622    1004.972    19.0

then a Cartesian match of the two sets of X and Y values with an error of 1.0 using the 1 and 2 option would give you a result like this:

   X_1         Y_1         Vmag    X_2         Y_2         Bmag    Separation
   ---         ---         ----    ---         ---         ----    ----------
   1134.822     599.247    13.8    1135.201     600.100    14.6    0.933
    909.613     543.293     9.3     909.523     543.800    10.1    0.515

using All from 1 would give you this:

   X_1         Y_1         Vmag    X_2         Y_2         Bmag    Separation
   ---         ---         ----    ---         ---         ----    ----------
   1134.822     599.247    13.8    1135.201     600.100    14.6    0.933
    659.68     1046.874    17.2
    909.613     543.293     9.3     909.523     543.800    10.1    0.515

and 1 not 2 would give you this:

   X           Y           Vmag
   -           -           ----
    659.68     1046.874    17.2
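The example above can be reproduced with a few lines of code. This is a deliberately naive sketch: it compares every row with every other row, whereas the real matching is considerably more efficient, and it handles only the three Join Type options shown in the example. The function names and the tuple-based table representation are hypothetical, and None stands for a blank cell.

```python
import math


def best_matches(table1, table2, max_error):
    """Map each matched row index of table1 to (row index in table2, separation).

    Rows are tuples whose first two items are X and Y. Only the closest row
    within max_error is kept, in the spirit of Best Match Only.
    """
    matches = {}
    for i, row1 in enumerate(table1):
        for j, row2 in enumerate(table2):
            sep = math.hypot(row1[0] - row2[0], row1[1] - row2[1])
            if sep <= max_error and (i not in matches or sep < matches[i][1]):
                matches[i] = (j, sep)
    return matches


def join(table1, table2, matches, join_type):
    """Build output rows for the "1 and 2", "All from 1" and "1 not 2" options."""
    blank = (None,) * (len(table2[0]) + 1)  # table2 columns plus Separation
    rows = []
    for i, row1 in enumerate(table1):
        if i in matches:
            j, sep = matches[i]
            if join_type in ("1 and 2", "All from 1"):
                rows.append(row1 + table2[j] + (round(sep, 3),))
        elif join_type == "All from 1":
            rows.append(row1 + blank)
        elif join_type == "1 not 2":
            rows.append(row1)
    return rows


table1 = [(1134.822, 599.247, 13.8), (659.68, 1046.874, 17.2), (909.613, 543.293, 9.3)]
table2 = [(909.523, 543.800, 10.1), (1832.114, 409.567, 12.3),
          (1135.201, 600.100, 14.6), (702.622, 1004.972, 19.0)]
matches = best_matches(table1, table2, 1.0)
both = join(table1, table2, matches, "1 and 2")
all_from_1 = join(table1, table2, matches, "All from 1")
only_1 = join(table1, table2, matches, "1 not 2")
```

The Separation values come out as the Euclidean distance between the matched X,Y positions, rounded here to three decimal places to match the tables.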


A.8 Internal Match Window

Internal Match Window

The Internal Match Window allows you to perform matching between rows of the same table, grouping rows that have the same or similar values in specified columns and producing a new table as a result. It can be obtained by using the Internal Match button in the Control Window toolbar or Joins menu.

You might want to use this functionality to remove all rows which refer to the same object from an object catalogue, or to ensure that only one entry exists for each object, or to identify groups of several "nearby" objects in some way.

The result is created from the Apparent version of the table being matched, so that any row subsets, hidden columns, or sorts currently in force will be reflected in the output. Progress information on the match, which may take some time, is provided in the logging window and by a progress bar at the bottom of the window. When it is completed, you will be informed by a popup window which indicates that a new table has been created. This table will be added to the list in the Control Window and can be examined, manipulated and saved like any other.

The window has the following parts:

Match Criteria box
Allows you to define what counts as a match between two rows (the same as for pair matching).
Column Selection box
Allows you to select which table to operate on and which columns supply the matching coordinates (the same as for pair matching).
Match Action box
Allows you to select what will be done (what new table will be created) when the matching groups of rows have been identified.
Log Window
Displays progress as the match is taking place. The progress bar at the bottom of the window also provides an indication of how far through each stage processing has got.
Control buttons
The Go button starts the search when you are happy with the selections that you have made, and the Stop button interrupts it midway if you decide you no longer want the results (closing the Match Window also interrupts the calculation).

A.8.1 Internal Match Action box

The Internal Match Action box gives a list of options for what will happen when an internal match calculation has completed. In each case a new table will be created as a result of the match. The options for what it will look like are these:

Mark Groups of Rows
The result is a table the same as the input table but with two additional columns: GroupID and GroupSize. Each group of rows which matched is assigned a unique integer, recorded in the GroupID column, and the size of each group is recorded in the GroupSize column. Rows which don't match any others (singles) have null values in both these columns. So for example by sorting the resulting table on GroupID you can group rows that match next to each other; or by sorting on GroupSize you can see all the pairs, followed by all the triples, ...

You can use this information in other ways, for instance if you create a new Row Subset using the expression "GroupSize == 5" you could select only those rows which form part of 5-object clusters.

Eliminate All Grouped Rows
The result is a new table containing only "single" rows, that is ones which don't match any other rows in the table according to the match criteria. Any rows which match are thrown out.
Eliminate All But First of Each Group
The result is a new table in which only one row (the first in the input table order) from each group of matching ones is retained. A subsequent internal match with the same criteria would therefore show no matches.
New Table With Groups of Size N
The result is a new "wide" table consisting of matched rows in the input table stacked next to each other. Only groups of exactly N rows in the input table are used to form the output table; each row of the output table consists of the columns of the first group member, followed by the columns of the second group member and so on. The output table therefore has N times as many columns as the input table. The column names in the new table have "_1", "_2", ... appended to them to avoid duplication.
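The first three actions in the list above can be illustrated for the simplest case, an Exact Value match on a single key column, where rows with a null key never match anything. The sketch is not TOPCAT's implementation: the function names are hypothetical, and numbering groups from 1 in order of first appearance is an assumption made for the example.

```python
from collections import defaultdict


def mark_groups(keys):
    """Return (group_ids, group_sizes) lists parallel to keys, None for singles."""
    members = defaultdict(list)
    for i, key in enumerate(keys):
        if key is not None:  # null values never match any other row
            members[key].append(i)
    group_ids = [None] * len(keys)
    group_sizes = [None] * len(keys)
    next_id = 1
    for rows in members.values():  # dicts keep first-seen order (Python 3.7+)
        if len(rows) > 1:
            for i in rows:
                group_ids[i] = next_id
                group_sizes[i] = len(rows)
            next_id += 1
    return group_ids, group_sizes


def eliminate_all_grouped(group_ids):
    """Row indices kept by Eliminate All Grouped Rows: singles only."""
    return [i for i, gid in enumerate(group_ids) if gid is None]


def eliminate_all_but_first(group_ids):
    """Row indices kept by Eliminate All But First of Each Group."""
    seen = set()
    kept = []
    for i, gid in enumerate(group_ids):
        if gid is None or gid not in seen:
            kept.append(i)
        if gid is not None:
            seen.add(gid)
    return kept


keys = ["a", "b", "a", None, "c", "b", "a", None]
group_ids, group_sizes = mark_groups(keys)
singles = eliminate_all_grouped(group_ids)
first_only = eliminate_all_but_first(group_ids)
```

Running mark_groups again on the keys of the rows kept by eliminate_all_but_first finds no groups, in line with the statement that a subsequent internal match with the same criteria would show no matches.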


A.9 Activation Window

Activation Window

The Activation Window allows you to configure an action to perform when a table row is activated by clicking on a row in the Data Window or a point in the Plot Window. It can be obtained by clicking the Activation Action selector at the bottom of the properties panel in the Control Window.

You have various options for how to define the action. On the left of the window is a list of options; you have to choose one of these to determine what kind of action will take place. When you click on one of these options the corresponding controls on the right hand side will become enabled: use these to select the details of the action and then click the OK button so that subsequent activation events will cause the action you have defined (or Cancel so that they won't). When you click OK the Activation Action in the control window will indicate the action you have configured.

The available options are as follows:

No Action
If this is selected, no special action will take place when a row is activated. This is the default.
Display Cutout Image
This option presents an easy-to-use way of popping up a cutout image from an image server displaying a region of sky around an activated row. You need to select the columns in your table which represent Right Ascension and Declination, including the units in which they are entered in the table (TOPCAT may be able to guess some or all of this information based on column names, UCDs and unit values, in which case it will enter its guesses in the selectors for you to accept or change). You also need to select the size in pixels of the image you want to see and, from those listed, the survey which will supply the image. When you activate the row, the program will attempt to contact the web server which provides these images, retrieve the image, and display it in an image viewer window.
View URL as Image
This option is suitable if one of the columns in your table gives the location (filename or URL) of an image file. The image may be in FITS, GIF, JPEG or PNG format, optionally compressed using gzip, Unix compress, or bzip2 format. Select the column which contains the location, and activating a row will pop up an image viewer to display it. See Appendix A.9.1 for more information about image viewers.
View URL as Spectrum
This option is suitable if one of the columns in your table gives the location (filename or URL) of a spectrum file. The spectrum may be in FITS or text format, optionally compressed. Select the column which contains the location, and activating a row will try to pop up a spectrum viewer to display it. Note this may not be possible if some components are not installed - see Appendix A.9.2 for information about spectrum viewers.
View URL as Web Page
This option is suitable if one of the columns in your table gives the location (filename or URL) of a web page; this should normally be in HTML or plain text, but depending on what browser you use other kinds of document may be supported. Select the column which contains the location and the browser which you would like to use for display, and activating a row will try to pop up a browser window to display it. See Appendix A.9.3 for more information about browsers.
Execute Custom Code
This option must be used if none of the others (which are fairly restrictive) do what you want. It is highly flexible, but not so easy to use. What you have to do is to write an expression following the rules in Section 6 involving some of the column names which will be invoked when a row is activated. This expression will typically have the effect of popping up an image or a spectrum in a viewer, but, especially if you link in your own functions (see Section 6.8) it can do pretty much anything.

Functions which are expected to be useful for activation actions are described in Section 6.5.2 and include some general-purpose ones (displayImage and displaySpectrum to display an image or spectrum in an external viewer) as well as a few which are relevant to particular survey data, for instance the spectra2QZ() function, which will pop up a spectrum viewer displaying all the spectra related to a given row of 2QZ survey data based on the contents of its NAME column.

As the above list shows, most of the activation actions you can define result in a viewer window of some kind popping up. Exactly what kind of viewer is used depends on how TOPCAT is set up and in some cases on your choices. More details of the viewer programs available are given in the following subsections. If these don't do what you want, you can use the Execute Custom Code option, perhaps in conjunction with user-defined functions or the System exec() functions described in Section 6.5.2, to invoke your own.

A.9.1 Image Viewer Applications

If you choose the Display Cutout Image or View URL as Image option in the Activation Window, then activating a row will display an image in an image viewer.

The default image viewer is SoG, an astronomical image viewer based on JSky, which offers colourmap manipulation, image zooming, graphics overlays, and other features. For this to work, JAI (Java Advanced Imaging) must be installed. JAI is a free component available from Sun, but is not part of the Java 2 Standard Edition by default. In operation, SoG looks like this:

SoG Image Viewer

If JAI or the SoG classes themselves are absent, a fallback viewer which just displays the given image in a basic graphics window with no manipulation facilities is used. The fallback image viewer looks like this:

Fallback Image Viewer

A.9.2 Spectrum Viewers

If you choose the View URL as Spectrum option in the Activation Window, then activating a row will display a spectrum in a spectrum viewer.

The default spectrum viewer is SPLAT, a sophisticated multi-spectrum analysis program. This requires the presence of a component named JNIAST, which may or may not have been installed with TOPCAT (it depends on some non-Java, i.e. platform-specific, code). There is currently no fallback spectrum viewer, so if JNIAST is not present, spectra cannot be displayed. In this case it will not be possible to select the View URL as Spectrum item in the Activation Window. An example of SPLAT display of multiple spectra is shown below.

SPLAT Spectrum Viewer

Full documentation for SPLAT is available on-line within the program, or in SUN/243.

A.9.3 Web Browsers

If you choose the View URL as Web Page option in the Activation Window, then activating a row will display the web page whose URL is in one of the columns in a web browser. You can choose which browser to use for this.

The default basic browser option uses a simple browser which can view HTML or plain text pages and has forward and back buttons which work as you'd expect. In many cases this is fine for viewing HTML pages, and it is available regardless of the system that you are running TOPCAT on. It looks like this:

Basic HTML browser

In some circumstances, it's possible to use your normal web browser for web page display instead. The list of browsers currently includes Firefox, Mozilla and Netscape as well as the basic one. Selecting these will generally only work if (1) the browser you select is installed and on your path, (2) you're on some Unix-like operating system, (3) the browser is already running when the action is invoked. In this case, the selected URL should be displayed in an existing browser window rather than opening a new one. Doing it this way has the advantage that your browser can probably display many types of document (perhaps using plugins) as well as HTML.


A.10 Help Window

Help Window

The help window is a browser for displaying help information on TOPCAT. It views the text contained in this document, so it may be what you are looking at now. The panel on the left hand side gives a hierarchical view of the available help topics, and the panel on the right hand side displays the help text itself. The bar in between the two can be dragged with the mouse to affect the relative sizes of these windows.

The toolbar contains these extra buttons:

Back
Moves backward through the list of topics in the order you have looked at them.
Forward
Moves forward through the list of topics in the order you have looked at them.
Print
Pops up a dialogue to permit printing of the current page to a file or printer (but see below).
Page Setup
Pops up a dialogue to do printer setup.

Although the printing buttons work, if you want to print out the whole of this document rather than just a few sections you may be better off printing the PDF version, or printing the single-page HTML version through a web browser. The most recent versions of these should be available on the web at http://www.starlink.ac.uk/topcat/sun253/sun253.html and http://www.starlink.ac.uk/topcat/sun253.pdf; you can also find the HTML version in the topcat jar file at uk/ac/starlink/topcat/help/sun253.html or, if you have a full TOPCAT installation, in docs/topcat/sun253/sun253.html and docs/topcat/sun253.pdf.

The help browser is an HTML browser and some of the hyperlinks in the help document point to locations outside of the help document itself. Selecting these links will go to the external documents. When the viewer is displaying an external document, its URL will be displayed in a line at the bottom of the window. You can copy the URL from this line using your platform's usual cut and paste mechanisms.


A.11 New Parameter Window

New Parameter dialogue window

The New Parameter window allows you to enter a new table parameter to be added to a table. It can be obtained by clicking the New Parameter button in the window described in Appendix A.3.2. A parameter is simply a fixed value attached to a table and can contain information which is a string, a scalar, an array... in fact exactly the same sorts of values which can appear in table cells.

The window is pretty straightforward to use: fill in the fields and click OK to complete the addition. The Type selector allows you to select what kind of value you have input. The only compulsory field is Parameter Name; any of the others may be left blank, though you will usually want to fill in at least the Value field as well. Often, the parameter will have a string value, in which case the Units field is not very relevant.


A.12 Synthetic Column Window

Synthetic Column dialogue window

The Synthetic Column Window allows you to define a new "Synthetic" column, that is one whose values are defined using an algebraic expression based on the values of other columns in the same row. The idea is that the value of the cells in a given row in this column will be calculated on demand as a function of the values of cells of other columns in that row. You can think of this as providing functionality like that of a column-oriented spreadsheet. You can activate the dialogue using the Add Column or Replace Column buttons in the Columns Window or from the (right-click) popup menu in the Data Window.

The window consists of a number of fields you must fill in to define the new column:

Name
The name of the new column. This should preferably be unique (different from all the other column names). It will be easier to use it in algebraic expressions if it is also:
Expression
This is the algebraic expression which defines the values that the cells in the new column of the table will have. The rules for writing algebraic expressions are described in Section 6, and detailed documentation of the functions you can use can be seen in the Available Functions Window, which you can see by clicking the Show Functions button on the toolbar.
Units
The units of the column. If the quantity it represents is dimensionless or you don't know the units, this can be left blank. It would be a good idea to use a similar format for the units to that used for the existing columns in the table.
Description
A short textual description of what the values contained by this column are. May be left blank.
UCD
A Unified Content Descriptor for the column; a UCD is a semantic label attached to the column indicating what kind of quantity it contains by picking one option from a list defined by the CDS. The list of known UCDs is available via a selection box, or you can type a UCD in by hand. You may leave this blank if you do not wish to assign a UCD to the column. A brief description of the selected UCD is visible below the selection box itself.
Index
Determines the position in the displayed table at which the new column will initially appear.
Of these, the Expression is the only one which must be filled in.

Having filled in the form to your satisfaction, hit the OK button at the bottom and the new column will be added to the table. If you have made some mistake in filling in the fields, a popup window will give you a message describing the problem. This message may be a bit arcane - try not to panic and see if you can rephrase the expression in a way that the parser might be happier with. If you can't work out the problem, it's time to consult your friendly local Java programmer (failing that, your friendly local C programmer may be able to help) or, by all means, contact the author.

If you wish to add more metadata items you can edit the appropriate cells in the Columns Window. You can edit the expression of an existing synthetic column in the same way.

Once created, a synthetic column is added to the Apparent Table and behaves just like any other; it can be moved, hidden/revealed, used in expressions for other synthetic columns and so on. If the table is saved the new column and its contents will be written to the new output table.
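
The idea of a column whose cells are calculated on demand can be pictured with a short Java sketch. It is only an illustration of the principle, not TOPCAT's actual implementation: the class name SyntheticColumnSketch, the representation of a row as an array of doubles, and the example values are all assumptions made for the example.

```java
import java.util.function.ToDoubleFunction;

/** Illustrative only: a column whose cells are computed from other cells in the same row. */
class SyntheticColumnSketch {

    /** Hypothetical row storage: one double per stored column. */
    private final double[][] rows;

    /** The "expression": maps the stored cells of one row to the new cell value. */
    private final ToDoubleFunction<double[]> expression;

    SyntheticColumnSketch(double[][] rows, ToDoubleFunction<double[]> expression) {
        this.rows = rows;
        this.expression = expression;
    }

    /** Nothing is stored for the new column; the value is calculated when it is asked for. */
    double getValue(int rowIndex) {
        return expression.applyAsDouble(rows[rowIndex]);
    }

    public static void main(String[] args) {
        // Two stored columns, standing in for B and V magnitudes (made-up values).
        double[][] table = { { 12.5, 12.0 }, { 15.25, 14.0 } };

        // Plays the role of an expression such as "B - V".
        SyntheticColumnSketch colour =
            new SyntheticColumnSketch(table, row -> row[0] - row[1]);

        for (int i = 0; i < table.length; i++) {
            System.out.println("row " + i + ": " + colour.getValue(i));
        }
    }
}
```

Because the value is derived each time it is requested, a change to one of the stored cells would be reflected in the synthetic cell without any extra bookkeeping, which is the spreadsheet-like behaviour described above.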


A.13 Sky Coordinates Window

Sky Coordinates Window

The Sky Coordinates Window allows you to add new columns to a table, representing coordinates in a chosen sky coordinate system. The table must already contain columns which represent sky coordinates; by describing the systems of the existing and of the new coordinates, you provide enough information to calculate the values in the new columns. You can activate this dialogue using the New Sky Coordinate Columns button in the Columns Window.

The dialogue window has two halves; on the left you give the existing columns which represent sky coordinates, their coordinate system (fk5, fk4, galactic, supergalactic or ecliptic) and the units (degrees, radians or sexagesimal) that they are in. Note that the columns available for selection will depend on the units you have selected; for degrees or radians only numeric columns will be selectable, while for sexagesimal (dms/hms) units only string columns will be selectable. On the right you make the coordinate system and units selections as before, but enter the names of the new columns in the text fields. Then just hit the OK button, and the new columns will be appended at the right of the table.
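
To make the difference between the unit choices concrete, here is a minimal Java sketch of how a sexagesimal string might be turned into degrees or radians. It is not TOPCAT's own conversion code: the class and method names are hypothetical, and it assumes colon-separated fields of the form dd:mm:ss.s or hh:mm:ss.s.

```java
/** Illustrative conversions from sexagesimal strings to numeric angles; not TOPCAT code. */
class SexagesimalSketch {

    /** Converts an optionally signed "dd:mm:ss.s" string to degrees. */
    static double dmsToDegrees(String dms) {
        String s = dms.trim();
        // Take the sign from the text rather than from the parsed degrees field,
        // so that "-0:30:00" comes out negative even though -0 parses as 0.
        boolean negative = s.startsWith("-");
        if (negative || s.startsWith("+")) {
            s = s.substring(1);
        }
        String[] parts = s.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected dd:mm:ss, got " + dms);
        }
        double magnitude = Integer.parseInt(parts[0])
                         + Integer.parseInt(parts[1]) / 60.0
                         + Double.parseDouble(parts[2]) / 3600.0;
        return negative ? -magnitude : magnitude;
    }

    /** Converts an "hh:mm:ss.s" string to degrees (one hour is 15 degrees). */
    static double hmsToDegrees(String hms) {
        return 15.0 * dmsToDegrees(hms);
    }

    public static void main(String[] args) {
        double raDeg = hmsToDegrees("12:00:00");
        double decDeg = dmsToDegrees("-0:30:00");
        System.out.println("RA  = " + raDeg + " deg = " + Math.toRadians(raDeg) + " rad");
        System.out.println("Dec = " + decDeg + " deg = " + Math.toRadians(decDeg) + " rad");
    }
}
```

The sketch also shows why sexagesimal values are held in string columns: the sign and the separate fields are part of the text and cannot be recovered from a single number.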


A.14 Algebraic Subset Window

Algebraic Subset dialogue window

The Algebraic Subset Window allows you to define a new Row Subset which uses an algebraic expression to define which rows are included. The expression must be a boolean one, i.e. its value is either true or false for each row of the table. You can activate this dialogue using the Add Subset button in the Subsets Window.

The window consists of two fields which must be filled in to define the new subset:

Subset Name
The name of the new subset. This should preferably be unique (different from existing subset names). It will be easier to use it in other expressions if it is also:
Expression
This is a boolean expression which defines the subset; it is a function of the values of any combination of the columns; only rows for which it evaluates to true will be included in the subset. The values of the other columns in the same row are referenced using their names or $ID identifiers, and other subsets may be referenced using their names or _ID identifiers. The rules for expression syntax are described in Section 6, and detailed documentation of the functions you can use can be seen in the Available Functions Window, which you can see by clicking the Show Functions button on the toolbar.

Having filled in the form to your satisfaction, hit the OK button at the bottom and the new subset will be added to the list that can be seen in the Subsets Window where it behaves like any other. If you have made some mistake in filling in the fields, a popup window will give you a message describing the problem.
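
The way a boolean expression picks out rows can be sketched in a few lines of Java. This is an illustration of the concept rather than a description of how TOPCAT stores subsets: the class name RowSubsetSketch, the use of a BitSet, and the row layout and values are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.function.Predicate;

/** Illustrative only: a row subset defined by a boolean expression evaluated per row. */
class RowSubsetSketch {

    /** Returns one flag per row, set where the expression is true for that row. */
    static BitSet select(List<double[]> rows, Predicate<double[]> expression) {
        BitSet included = new BitSet(rows.size());
        for (int i = 0; i < rows.size(); i++) {
            if (expression.test(rows.get(i))) {
                included.set(i);
            }
        }
        return included;
    }

    public static void main(String[] args) {
        // Each row holds a magnitude and an ellipticity (made-up values).
        List<double[]> rows = Arrays.asList(
            new double[] { 14.2, 0.3 },
            new double[] { 18.9, 0.1 },
            new double[] { 15.0, 0.8 });

        // Plays the role of an expression such as "MAG < 16".
        BitSet bright = select(rows, row -> row[0] < 16.0);
        System.out.println("Rows in subset: " + bright);
    }
}
```

Only the rows for which the predicate returns true are flagged, which mirrors the rule that a row belongs to the subset exactly when the expression evaluates to true for it.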


A.15 Available Functions Window

Available Functions Window

This window displays all the functions (Java methods) which are available for use when writing algebraic expressions. This includes both the built-in functions and any extended ones you might have added. You can find this window by using the Show Functions button in the Synthetic Column or Algebraic Subset window toolbars.

On the left hand side of the window is a tree-like representation of the functions you can use. Each item in this tree is one of the following:

Folder
A group of classes. There's only one of these, marked "Activation Functions", and it contains functions which are only available for use in Activation Actions. They cannot be used when defining new synthetic columns or algebraic subsets.
Class
A set of functions and/or constants; it doesn't matter what class a function is in when you use it, but since the functions in a class are usually related this makes it easier to find the one you're looking for in this window.
Function
A function that you can use in an expression.
Constant
A constant value which you can refer to by name in an expression (as long as it doesn't clash with a column name).

Of these, the Folder and Class items have a 'handle', which means that they contain other items (classes and functions/constants respectively). By clicking on the handle (or equivalently double-clicking on the name) you can toggle whether the item is open (so you can see its contents) or closed (so you can't). So to see the functions in a class, click on its handle and they will be revealed.

You can click on any of these items and information about it will appear in the right hand panel. In the case of functions this describes the function, its arguments, what it does, and how to use it. The explanations should be fairly self-explanatory; for instance the description in the figure above indicates that you could use the invocation atan2(X_POS,Y_POS) as the expression for a new table column which gives the angle from the X axis of a point whose position is given by columns with the names X_POS and Y_POS. Examples of a number of these functions are given in Section 6.7.
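
For comparison, the sketch below shows the same kind of calculation using the standard java.lang.Math.atan2 method, which takes the Y value first and the X value second. Whether a function listed in this window uses the same argument order is not assumed here; the description shown in the right hand panel is the place to check. The class and method names are invented for the illustration.

```java
/** Illustrative only: the angle of a point from the X axis using the standard Java library. */
class AngleSketch {

    /** Angle in radians, measured anticlockwise from the positive X axis, of the point (x, y). */
    static double angleFromXAxis(double x, double y) {
        // java.lang.Math.atan2 takes its arguments in the order (y, x).
        return Math.atan2(y, x);
    }

    public static void main(String[] args) {
        System.out.println(Math.toDegrees(angleFromXAxis(1.0, 1.0)) + " degrees");
        System.out.println(Math.toDegrees(angleFromXAxis(0.0, 1.0)) + " degrees");
    }
}
```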

Using the Add button you can specify the name of a class to add to those available. You should enter the fully-qualified class name (i.e. including the dot-separated package path). The class that you specify must be on the class path which was current when TOPCAT was started, as explained in Section 7.2.1. Note however it would be more usual to specify these using the system property jel.classes or jel.classes.activation at startup, as described in Section 6.8. Classes added in this way will be visible in the tree, but may not have proper documentation (clicking on them may not reveal a description in the right hand panel).
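
A class suitable for adding might look something like the following sketch. It assumes, in line with the description of functions as Java methods, that the functions made available are the public static methods of the class; Section 6.8 gives the actual rules. The class name ExtraFunctions and its two methods are hypothetical, and the class is placed in the default package only to keep the example short.

```java
/** Hypothetical user-supplied class of functions for use in expressions. */
public class ExtraFunctions {

    /** Not meant to be instantiated; it only groups static functions. */
    private ExtraFunctions() {
    }

    /** Converts a positive flux to a magnitude, given a zero point. */
    public static double fluxToMag(double flux, double zeroPoint) {
        return zeroPoint - 2.5 * Math.log10(flux);
    }

    /** Cartesian distance between the points (x1, y1) and (x2, y2). */
    public static double separation(double x1, double y1, double x2, double y2) {
        return Math.hypot(x2 - x1, y2 - y1);
    }

    public static void main(String[] args) {
        System.out.println(fluxToMag(100.0, 25.0));
        System.out.println(separation(0.0, 0.0, 3.0, 4.0));
    }
}
```

With such a class available, an expression along the lines of separation(X1, Y1, X2, Y2) could then be written, where the column names are again made up for the example.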


A.16 Log Window

Log Window

The log window can be obtained using the View Log option on the File menu of the Control Window.

This window displays any log messages which the application has generated. Depending on whether the -verbose flag has been specified, some or all of these messages may have been written to the console as well (if there is a console - this depends on how you have invoked TOPCAT). Under some circumstances, the oldest messages in the list may not be displayed.

To clear the display of all the existing messages you can use the Clear Log button.

The messages displayed here are those written through Java's logging system - in general they are intended for debugging purposes and not for users to read, but if something unexpected is happening, or if you are filing a bug report, it may provide some clues about what's going on. Although it tries not to disturb things too much, TOPCAT's manipulation of the logging infrastructure affects how it is set up, so if you have customised your logging setup using, e.g., the java.util.logging.config.* system properties, you may find that it's not behaving exactly as you expected. Sorry.
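
For readers unfamiliar with Java's logging system, the following sketch shows the general mechanism by which messages can be collected for display: a java.util.logging.Handler is attached to a logger and receives each record written through it. This is not TOPCAT's own handler; the class name LogCollectorSketch and the logger name example.sketch are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Illustrative only: a handler that keeps log records in memory so they can be shown later. */
class LogCollectorSketch extends Handler {

    private final List<LogRecord> records = new ArrayList<>();

    @Override
    public synchronized void publish(LogRecord record) {
        // Respect this handler's own level and filter settings.
        if (isLoggable(record)) {
            records.add(record);
        }
    }

    @Override
    public void flush() {
        // Nothing is buffered outside the list, so there is nothing to flush.
    }

    @Override
    public void close() {
        // No external resources to release.
    }

    /** Number of messages currently held. */
    synchronized int size() {
        return records.size();
    }

    /** Discards all held messages, much as a "clear log" action would. */
    synchronized void clear() {
        records.clear();
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("example.sketch");
        logger.setUseParentHandlers(false); // keep the messages off the console
        logger.setLevel(Level.ALL);

        LogCollectorSketch collector = new LogCollectorSketch();
        logger.addHandler(collector);

        logger.info("Table loaded");
        logger.fine("Detailed debugging message");
        System.out.println(collector.size() + " messages collected");

        collector.clear();
        System.out.println(collector.size() + " messages after clearing");
    }
}
```

Installing a handler and changing logger levels in this way alters the logging configuration for the whole application, which is why a customised logging setup may not behave exactly as expected.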


B Release Notes

This is TOPCAT, Tool for OPerations on Catalogues And Tables. It is a general purpose viewer and editor for astronomical tabular data developed within the UK Starlink project.

Author
Mark Taylor (Starlink, Bristol University)
Email
m.b.taylor@bristol.ac.uk
WWW
http://www.starlink.ac.uk/topcat/
User comments, suggestions, requests and bug reports to the above address are welcomed.

Related software products are

STIL
The Starlink Tables Infrastructure Library, which provides the table handling classes on which TOPCAT is based.
STILTS
The STIL Tool Set, which provides some command-line tools based on STIL. The intention is that this should be a non-graphical counterpart to TOPCAT, providing many of the same facilities (matching, row selection, format conversion etc) but in a form which can be incorporated into scripts, web services, etc. The current release only contains a few of these features however.

The Starlink project under which TOPCAT and friends have been developed has been shut down as of July 2005. At the time of writing, this means that there is currently no provision for continued support and development of the software. The author is currently pursuing possibilities for further funding, but it's not yet clear whether or how these will work out. Probably at least a minimal level of support will continue to be available, one way or another. The email address above may or may not continue to be active; check the TOPCAT web page for news in the future.


B.1 Acknowledgements

Inspiration for many of TOPCAT's features has been taken from the following pre-existing tools:

Apart from the excellent Java 2 Standard Edition itself, the following external libraries provide important parts of TOPCAT's functionality:

The following users, testers and programmers have supplied useful comments (apologies for any missed out):


B.2 Version History

Releases to date have been as follows:

Version 0.3b (4 June 2003)
First public release
Version 0.4b (8 July 2003)
Version 0.4-1b (10 July 2003)
Version 0.5b (20 October 2003)
Version 0.5-1 (18 November 2003)
Version 1.1-0 (21 April 2004)
Version 1.1-3 (5 May 2004)
Version 1.3 (20 October 2004)
This version has introduced many improvements in scalability, efficiency and functionality. TOPCAT is now quite happy with tables of a million rows or more (and hundreds of columns) even on systems with quite modest memory/CPU resources. The main improvements are as follows:
Plotting
  • Plotting is much faster and can handle many more points
  • Subsets can be selected from plot window by tracing out a non-rectangular region
  • You have more choice over plotting symbols (including semi-transparent ones)
  • Finally X or Y axes can be flipped!
  • Export to encapsulated PostScript is of improved quality (though for many points file sizes can get large)
  • Export to GIF format is available
  • Regression lines can be plotted and coefficients displayed (experimental capability - could be improved)
Table Formats
  • "-disk" flag allows use of disk backing storage for large tables
  • New 'FITS-plus' format stores rich table/column metadata in a FITS file
  • VOTable handler now fully VOTable 1.1 and 1.0 compatible
  • VOTable parsing now works with Java 5.0 platform
  • Comma-Separated Value format now supported for input and output
  • ASCII input handler rewritten to cope with much larger tables
  • ASCII handler now understands d/D as exponent letter as well as e/E
  • ASCII handler now uses Short/Float not Integer/Double where appropriate to save memory
  • ASCII format fixed bug for -0 degrees/hours in sexagesimal angles
  • Null handling improved for FITS & VOTable formats
  • FITS files store column descriptions in TCOMMx headers
  • Better error messages for unparsable tables
Table Joins
  • Various efficiency improvements and reductions in memory requirements
  • In cases of multiple possible matches, the closest is now chosen rather than picking one at random
  • Pair match now adds column containing score for each match (distance between points)
  • Units can be selected for RA/Dec columns and match errors (so it doesn't need to be all in radians)
  • New match types suitable for multivariate matching (anisotropic Cartesian, Sky+X, Sky+XY)
Data/Metadata Manipulation
  • Can add/remove table parameters
  • One-step column replacement dialogue from data or column windows
  • Synthetic column expressions now written out to column descriptions
GUI Navigation and Display
  • Improved rendering of numbers in tables (esp. Floats)
  • Better detection of displayed table column widths
  • New Control Window option on File menus
  • Better window resizing for some dialogue boxes
  • Less confusing error messages in many places
Algebraic Expressions
  • All available functions are now fully documented in help document and interactive Method Window
  • Many new trig, coordinate, type conversion, string manipulation functions
  • Big performance improvements for null values
Activation Actions
  • Clicking a point in the plot highlights the corresponding row in the data window and vice versa
  • Row selection can trigger display of a sky cutout region
  • Row selection can trigger user-defined actions on activation

In addition, the following incompatibilities and changes have been introduced since the last version:

Version 1.3-1 (10 November 2004)
Minor changes:
Version 1.3-2 (6 Dec 2004)
Bug fix:
Version 1.4 (4 Feb 2005)
Load Dialogues
The graphical table load dialogue has been overhauled, and now has two main new features. First, it has been rewritten so that the GUI does not freeze during a long load; it is still currently not possible to interact with other TOPCAT windows while a load is taking place, but you can now cancel a load that is in progress.

Secondly, the provision of load dialogues has been modularised, and a number of new dialogues provided. The new ones are:

  • Cone Search
  • MySpace Browser
  • Registry Query
  • SIAP Query
If the required classes are present, you can acquire tables from these external sources as well as the traditional methods of loading from disk etc. New command line flags corresponding to each of these have been added to ensure that they are present and make them prominent in the load dialogue. Furthermore it is possible to plug in additional load dialogues at runtime using the startable.load.dialogs system property.

The appearance of the Load Window has changed; now only the File Browser button is visible along with the Location field in the body of the window, but the DataSources menu can be used to display other available table import dialogues.

Packaging
The program can now be obtained in two standalone forms: topcat-full.jar and topcat-lite.jar. The former is much larger than before (11 Mbyte), since it contains a number of classes to support custom load dialogues such as the MySpace browser and web service interaction, as well as the SoG classes. The latter contains only the classes for the core functionality, and is much smaller (3 Mbyte).
Explode Array Column action
There is now a new button in the Columns Window which replaces an array-valued column with a scalar column for each of its elements.
Paste'n'Load
You can now load a table by pasting its filename or URL as text into the table list in the Control Window (using the X selection on X-windows - not sure if or how this works on other platforms).
Help message
The result of topcat -help is now more comprehensive, describing briefly what each option does and listing system properties as well as arguments/flags proper.
Version 1.4-1 (8 February 2005)
Version 1.5 (17 March 2005)
File Access
Load dialogues have changed again somewhat, and save dialogues as well. The default file browser in both cases is now a Filestore Browser, which is very much like the standard file browser, but can browse files in remote filesystems as well; currently supported are files in AstroGrid's MySpace or on an SRB (Storage Resource Broker) server. You can now save files to these remote locations as well as load from them.

In addition, the save dialogue now displays the current row subset and sort order - this makes it easier to see and/or change the details of the table you're about to save.

BugFixes
A few more minor changes have been made.
  • Error display dialogue boxes have been improved in some places
  • Various bugs relating to JDBC database access have been fixed
  • Some minor issues relating to VOTables with single-character columns have been addressed
Version 1.6 (30 June 2005)
Activation Actions
Some more activation functionality has been added:
  • New View URL as Web Page option introduced in Activation Window
  • New System class of activation functions containing exec functions which execute commands on the local operating system
  • New Browsers class of activation functions for displaying URLs in web browsers (external or basic fallback one)
Algebraic Functions
New Times class added containing functions for converting between Modified Julian Day and ISO 8601 format epochs.
Sky Matching
The default sky matching algorithm now uses HEALPix rather than HTM for assigning sky pixels to RA,Dec positions. This gives much faster sky matches in most cases, and uses somewhat less memory so can be used on larger tables. It has also fixed a bug which missed out some possible matches. HTM-based matching is currently still provided as an option, but this is mainly for debugging purposes and may be withdrawn in the future.
Logging
The message logging has been tidied up. The main observable consequence of this is that fewer untidy messages are written to the console when TOPCAT is run from a standalone jar file rather than a full starjava installation. By specifying the new -verbose (or -v) flag one or more times you can get those messages back. The messages (in fact all logging messages at any level) can also be viewed from the GUI by using the new File|Show Log menu option from the Control Window.
SOAP Services
TOPCAT now acts as a SOAP server; SOAP requests can now be made to a running instance of TOPCAT to get it to display tables by location or by sending XML for a VOTable direct. Because of limitations in Axis, this latter method won't work for arbitrarily large tables.
Documentation changes
The tablecopy tool is no longer covered in this document; it is replaced by the tcopy tool in the separate STILTS package. There has also been some reorganisation of this document, mainly in the appendices.
Minor changes
  • Added -version flag
  • Added (dummy) Print option to Data Window. This just presents a message to the effect that you should save to a printable format.
  • Fixed a bug which gave errors when expressions contained a NULL_ test on the first column of a table.
  • Modified one of the demo tables to contain a column with URLs in it.
Version 1.6-1 (7 July 2005)
Bugfixes:
Version 1.7 (30 September 2005)
Crossmatching
There have been major improvements in the flexibility, and minor improvements to performance, of two-table crossmatching.
  • New match algorithm Sky with Errors introduced. This allows you to specify a column giving the maximum permissible match error (so it can vary per row) rather than a fixed value for the whole table.
  • In the case of multiple possible matches between the two tables, instead of automatically giving you only the closest match, you can now select whether you'd like only the closest one or all those which fit your criteria.
  • You can now specify which rows you want to see in the output: 1 and 2, 1 or 2, All from 1, All from 2, 1 not 2, 2 not 1, 1 xor 2. This is pretty much all the possibilities which make sense, and in particular allows you to do 'left outer joins' (1 not 2).
  • The match score column which results from most matches now comes (a) in sensible units where possible (e.g. arcseconds not radians) and (b) with metadata which tells you what its meaning and units are.
  • More information is available in added columns after the match; as well as the match score, information about matched groups is inserted where appropriate.
  • The "Spherical Polar" match algorithm is now rebadged as the hopefully less confusing "Sky 3d".
Similar changes for 1-table and multi-table matches should follow in future versions.
MySpace Access
MySpace I/O has been re-implemented to use the ACR rather than the (now deprecated) CDK classes it was using before. As well as probably being more reliable and less likely to break with future changes in MySpace server protocols, this gives the benefit of single sign on. The effect of this is that you will need to have the AstroGrid desktop running on your machine before you can access MySpace from TOPCAT.
Algebraic functions
  • Added Julian Epoch and Besselian Epoch conversion functions to Times class.
  • Added RANDOM special function.
Miscellaneous
  • When you select a column in the Columns window, it now scrolls the table in the Data Window so that the selected column is visible. This is a boon when dealing with tables that have very many columns.
  • String "null" interpreted as a blank value in ASCII tables.
  • Added new activation action to launch system default browser.
Bugfixes
  • Fixed some relatively harmless bugs to do with actions available when you select the dummy "Index" column. You can now unsort from a popup menu in the table viewer window.
  • Believed to work fine with Java 1.5 now (there were previously some issues with MySpace at Java 1.5).
  • Fixed bug in ASCII input handler which misidentified blank lines, or DOS-format line ends, as end of file.
Version 1.7-1 (4 October 2005)
Bugfixes:
Version 1.8 (13 October 2005)


TOPCAT - Tool for OPerations on Catalogues And Tables
Starlink User Note 253
TOPCAT web page: http://www.starlink.ac.uk/topcat/
Author email: m.b.taylor@bristol.ac.uk
Starlink: http://www.starlink.ac.uk/