The University of Florida Sparse Matrix Collection
Tim Davis, University of Florida,
http://www.cise.ufl.edu/~davis
1. Overview
UFgui is a Java GUI for the University of Florida Sparse Matrix
Collection at http://www.cise.ufl.edu/research/sparse/matrices. It provides an
easy way to select matrices from the collection based on their matrix
properties. As of October 2010, there were 2328 matrices ranging in size
from 1-by-2 (with a single nonzero entry) to a square matrix with nearly 28
million rows and 760 million entries. The total size of the collection in all
three formats exceeds 47 GB and continues to grow.
2. Quick Start
First, download the UFget archive (tar.gz or zip file), and uncompress it.
Open the UFget folder.
If your web browser requires an HTTP proxy, or if you wish to change
the download location, first see the Customization section below,
to configure UFgui before you run it.
To run UFgui, double-click its icon or type the following command in your
command window / terminal:
java -jar UFgui.jar
If the above command fails, then you need to install Java. See
http://www.java.com/en/download/manual.jsp for details.
You can skip reading this document by simply navigating the GUI itself. Most
of the buttons, check boxes, lists, and table columns have a short "tool tip"
which is visible if you hover your mouse over the item.
When the UFgui starts, it checks for any missing matrix icons and
downloads a new table if needed. This might take a few minutes, so
just sit back and enjoy the slideshow. This happens only occasionally.
3. A Sparse Matrix Problem
The UF Sparse Matrix Collection is a simplified name. It is actually a
collection of sparse matrix problems, not just sparse matrices. Each problem
includes one primary sparse matrix and meta-data about that matrix. Some
problems include additional matrices and vectors (sparse or dense) such as
right-hand-sides to a linear system Ax=b, or cost constraints for a
linear programming problem. As a short-hand, a "problem" in the collection is
often called simply a "matrix", referring to the primary sparse matrix in the
problem (A, below).
The following data is always present in any sparse matrix problem
(not all fields are shown in the table in the UFgui program, however):
- name: the full name of the problem, in the form
Group/Name, where Group indicates the source of the
problem (a person, organization, or other collection), and Name
is the name of the problem within that Group. As of September
2009, there were 153 matrix Groups.
- title: a short descriptive title.
- A: the primary sparse matrix itself.
It is the properties of this matrix that determine
the selection criteria.
- id: a unique serial number ranging from 1 to the number of
matrices in the collection. As new matrices are added to the
collection they are given higher ids, so new matrices are always
at the end of the table when sorted by id.
- date: the date the matrix was created, or the date it was
added to the UF Sparse Matrix Collection if the creation date is not
known.
- author: the person or persons who created the matrix.
- ed: the person or persons who acquired the matrix from the
author and placed it in a widely accessible collection (this
one or others such as the Rutherford/Boeing collection or
the Matrix Market collection).
- kind: the domain of the problem, such as "circuit simulation",
"chemical process simulation", "finite-element problem", and so on.
As of September 2009, there were 36 different matrix kinds.
The following data is present in some problems but not all:
- Zeros: a binary matrix holding the pattern of
entries explicitly provided by the matrix author which are equal to
zero. These may represent entries that may become nonzero later on in
a simulation, but they are equal to zero in this instance of the
matrix. Since they were provided by the matrix author(s), they are
preserved in this collection. In the MATLAB format, this matrix is
held as a different sparse matrix, since MATLAB removes explicit zeros
from its sparse matrices. In the Matrix Market and Rutherford/Boeing
formats, these entries are included in the primary sparse matrix
itself, as explicitly zero entries.
- b: the right-hand-side to a linear system or least-squares
problem. This can be a vector or matrix, real or complex, and
sparse or dense.
- x: the solution to a linear system or least-squares problem.
- notes: text, with a discussion about the problem, in no
particular format.
- aux: Any number of matrices, vectors, or text. For example, an
optimization problem may include a cost vector c, and vectors
that specify upper/lower bounds on the solution. Details of how to
interpret this auxiliary data are normally given in the notes
field. In the MATLAB format this is a struct containing each of the
items. In the Matrix Market and Rutherford/Boeing formats, this data
resides alongside the primary matrix in a single compressed folder.
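The two field lists above can be pictured as a single record. The following is only a sketch of such a record in Python; the class name SparseProblem and the group and short_name helpers are hypothetical and are not part of UFgui or UFget. The field names follow the lists above, and the matrix fields are left untyped because their in-memory representation depends on the format you load.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class SparseProblem:
    """Hypothetical record mirroring the problem fields listed above."""

    # Fields that are always present.
    name: str    # full name, in the form "Group/Name"
    title: str   # short descriptive title
    A: Any       # primary sparse matrix (representation left open)
    id: int      # serial number within the collection
    date: str    # creation date, or date added to the collection
    author: str  # who created the matrix
    ed: str      # who placed it in a widely accessible collection
    kind: str    # problem domain

    # Fields present in some problems but not all.
    Zeros: Optional[Any] = None   # pattern of explicit zero entries
    b: Optional[Any] = None       # right-hand side
    x: Optional[Any] = None       # solution
    notes: Optional[str] = None   # free-form discussion
    aux: dict = field(default_factory=dict)  # auxiliary data by name

    @property
    def group(self) -> str:
        """The Group part of the full name."""
        return self.name.split("/", 1)[0]

    @property
    def short_name(self) -> str:
        """The Name part of the full name."""
        return self.name.split("/", 1)[1]
```

For a problem whose name is "HB/west0067", group is "HB" and short_name is "west0067".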
4. Matrix Formats
Each matrix in the collection appears in three different formats. The data
within each format is identical, down to the very last bit (if you find a
discrepancy between the same problem in different formats, please let me know).
- MATLAB™ *.mat file. This can be loaded into MATLAB directly,
or it can be accessed via the UFget.m MATLAB interface to the collection.
Type help UFget in MATLAB for more details. The name of the MATLAB
file is of the form mat/Group/Name.mat (such as mat/HB/west0067.mat for the
west0067 matrix from the HB, or Harwell/Boeing, group).
- Matrix Market *.mtx file and associated files. Short meta-data
(name, title, id, date, author, ed, kind, and notes) is given as structured
comments in the primary *.mtx file. This file and any associated files (b,
x, and any aux matrices) are in a single folder which is then archived into
a *.tar.gz file. Windows users will need a 3rd-party program for handling
*.tar.gz files. For example, the Matrix Market format for the HB/west0067
matrix is held in the MM/HB/west0067.tar.gz file. See
http://math.nist.gov/MatrixMarket/index.html for more information about the
Matrix Market format.
- Rutherford/Boeing *.rb file and associated files. Short meta-data
(name, title, id, date, author, ed, kind, and notes) is given as structured
comments in a separate text file. The matrix file, the informational text
file, and any associated files (b, x, and any aux matrices) are in a single
folder which is then archived into a *.tar.gz file. For example, the
Rutherford/Boeing format for the HB/west0067 file is held in the
RB/HB/west0067.tar.gz file. See http://www.cse.scitech.ac.uk/nag/hb/ for
more information about the Rutherford/Boeing format.
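The naming pattern in the three examples above (mat/HB/west0067.mat, MM/HB/west0067.tar.gz, and RB/HB/west0067.tar.gz) can be written as a small Python helper. This is a sketch only: the function name local_paths is hypothetical, and the paths are relative to whatever folder holds the mat, MM, and RB folders.

```python
def local_paths(full_name: str) -> dict:
    """Map a full problem name "Group/Name" to its file in each format.

    The paths follow the examples given in the text and are relative
    to the folder that contains the mat, MM, and RB folders.
    """
    group, name = full_name.split("/", 1)
    return {
        "mat": f"mat/{group}/{name}.mat",    # MATLAB format
        "MM": f"MM/{group}/{name}.tar.gz",   # Matrix Market format
        "RB": f"RB/{group}/{name}.tar.gz",   # Rutherford/Boeing format
    }
```

For example, local_paths("HB/west0067")["mat"] returns "mat/HB/west0067.mat".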
5. Selecting Matrices
In the UFgui you are presented with four primary panels.
Selection Criteria panel:
This is used for rule-based selections of matrix subsets. A matrix must
satisfy all properties in this panel to be selected. With the default settings
(available when the UFgui starts or after clicking Reset criteria) all
matrices fit the criteria. However, no matrix is selected until you press the
Select button or select them individually in the table by clicking the
checkbox in the select column.
The selection criteria are based on the matrix properties listed in
the matrix table (described below). To select matrices from one or more groups
and/or "kinds," click on the choices in the "group" and/or "kind" lists.
Shift-click the list to select a range of groups or kinds, and control-click
the list to add a single item to your highlighted choices. To clear your
choices, click the Reset criteria button, described below.
When you select/deselect matrices, the boxes to the left of each matrix in
the Table of Matrices get checked/unchecked. You still see all the matrices
in the list, because you can then modify your selection by checking/unchecking
matrices one at a time in the Table itself.
At the bottom of the Selection Criteria panel is a row of buttons:
- Select: Click this to add matrices to your selection
that fit the criteria. No matrix is removed from any prior selection.
For example, to select all square matrices with fewer than 1000 rows,
plus all complex matrices, do the following. First enter 1000 in the
top-right text field, click the square radio button, and then
click Select. Then click Reset criteria. Next, click
complex and then Select.
- Deselect: Click this to remove matrices from your selection
that fit the criteria. In general, if you want matrices that meet some
criteria (a) but not (b), then set the controls for (a) and click
Select. Next, click Reset criteria, set the criteria
(b), and click Deselect. For example, to select all square
matrices with fewer than 1000 rows that are not complex, you could just
do this with a single click of Select (click square,
real, and enter 1000 in the top-right text field, then click
Select). Alternatively, you could click square and enter
1000 as the upper bound on the number of rows and click Select
then Reset criteria, select complex, and click
Deselect.
- Reset criteria: This has no effect on your selection. It
simply resets the criteria to the default (all matrices). Thus, to
select all matrices, click Reset criteria and then
Select. To deselect all matrices, click Reset criteria
and then Deselect.
- Clear selections: This has no effect on your selection
criteria. It simply clears all selections, deselecting all matrices by
unchecking the select column in the table. This is useful if
you have complex selection criteria and don't want to lose them by
clicking Reset criteria, but you wish to clear all your
selections anyway.
- Help: this button brings up the text you're reading.
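The behavior of Select and Deselect described above amounts to set union and set difference on the current selection. The Python sketch below illustrates this with a made-up four-row table; the function names, the dictionary keys, and the table contents are all hypothetical, and UFgui's own implementation may differ.

```python
def select(selection, matrices, criteria):
    """Add every matrix that fits the criteria; remove nothing."""
    return selection | {m["id"] for m in matrices if criteria(m)}


def deselect(selection, matrices, criteria):
    """Remove every matrix that fits the criteria; add nothing."""
    return selection - {m["id"] for m in matrices if criteria(m)}


# A made-up table, for illustration only (not real collection data).
table = [
    {"id": 1, "rows": 500, "cols": 500, "real": True},
    {"id": 2, "rows": 500, "cols": 500, "real": False},
    {"id": 3, "rows": 2000, "cols": 2000, "real": False},
    {"id": 4, "rows": 500, "cols": 600, "real": True},
]


def small_square(m):
    return m["rows"] == m["cols"] and m["rows"] < 1000


def is_complex(m):
    return not m["real"]


# Square matrices with fewer than 1000 rows, plus all complex matrices.
plus_complex = select(select(set(), table, small_square), table, is_complex)

# Square matrices with fewer than 1000 rows that are not complex.
not_complex = deselect(select(set(), table, small_square), table, is_complex)
```

Here plus_complex is {1, 2, 3} and not_complex is {1}, matching the two worked examples in the Select and Deselect items.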
Matrix icon panel:
To the right of the Selection Criteria panel is an icon of the most recent
matrix highlighted, downloaded, or in the process of being downloaded.
Table of matrices:
This is a list of all the matrices in the collection. You can click on any
column header to sort the table by that column. Clicking a row (to the right
of the select column) will highlight that row. You can highlight a range of
rows by shift-clicking another row. Control-click will add individual rows.
Next, right-clicking in the table will pull up a pop-up menu allowing you to
select or deselect the highlighted matrices and to export your selection to a
file. You can export a list of your selected matrices to a comma-separated
file (*.csv) or to a MATLAB (*.m) file. When you highlight a matrix, its icon
is displayed.
The table contains the following columns (you can also hover your mouse
over each column header for a short description):
- select: whether or not you have selected the matrix.
You can click on the check boxes in this column to modify your
selection on a matrix-by-matrix basis. This is the only column in
the table that you can edit.
- mat: this box is checked if you have downloaded the matrix
in its MATLAB format. The location of the HB/west0067.mat file
will be UFget/mat/HB/west0067.mat, for example.
- MM: this is checked if you have already downloaded
the matrix in its Matrix Market format.
- RB: this is checked if you have already downloaded
the matrix in its Rutherford/Boeing format.
- id: the matrix id, in the range 1 to the number of
matrices in the collection. When the UFgui starts,
the table is sorted in this order.
- Group: the group the matrix belongs to.
- Name: the (short) name of the matrix. The full name of
a matrix is Group/Name.
- # rows: the number of rows of the matrix.
- # cols: the number of columns of the matrix.
- # nonzeros: the number of nonzeros in the matrix.
- real: this box is checked if the matrix is real.
It is complex otherwise.
Note that real matrices include any matrix that is not
complex. In particular, integer and binary matrices are
considered real in this search criterion.
- binary: this box is checked if the matrix is binary.
It is non-binary otherwise (there are no binary complex matrices,
so any matrix that is binary is also marked as real).
- 2D/3D: this box is checked if the matrix comes
from a discretization of a 2D or 3D geometry.
- posdef: this box is checked if the matrix is
symmetric and positive definite.
- psym: the symmetry of the nonzero pattern of the
matrix A (including the binary Zeros matrix as well). Let
S = pattern(pattern(A)+Zeros) where pattern(A) is a
binary matrix with entries equal to 1 where A(i,j) is
nonzero, and zero elsewhere. The psym metric is zero if the
pattern of S is completely unsymmetric, and 1.0 if the
pattern is exactly symmetric. Let pmatched be the number of
off-diagonal entries S(i,j) for which S(i,j)=S(j,i)
and let pnzoffdiag be the total number of off-diagonal entries
in S. Then psym is the ratio
pmatched/pnzoffdiag.
- nsym: the symmetry of the numerical values of the
matrix. This property ignores the Zeros matrix. It is
equal to 0 for a perfectly unsymmetric matrix, and 1.0 for a
symmetric matrix. Let matched be the number of nonzero
off-diagonal entries A(i,j) for which A(i,j)=A(j,i)
and let nzoffdiag be the total number of off-diagonal nonzero
entries in A. Then nsym is the ratio
matched/nzoffdiag.
- kind: the problem domain. Note that this is typically
related to the problem group, since many matrix authors submit
matrices to the collection that arise from a single problem domain.
Some groups have problems from many domains, however.
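The psym and nsym definitions above can be checked on a small example. The Python sketch below stores a square matrix as a dictionary mapping (i, j) to a value. The function name symmetry is hypothetical, and two points are assumptions not stated in the text: the matrix is taken to be square, and a matrix with no off-diagonal entries is given a ratio of 1.0.

```python
def symmetry(A, Zeros=None):
    """Return (psym, nsym) for a square matrix held as {(i, j): value}.

    Zeros, if given, holds the positions of explicit zero entries
    (only its keys are used).
    """
    Zeros = Zeros or {}

    # S = pattern(pattern(A) + Zeros): positions that are nonzero in A
    # or that appear in the Zeros matrix.
    S = {ij for ij, v in A.items() if v != 0} | set(Zeros)
    s_off = [(i, j) for (i, j) in S if i != j]
    pmatched = sum(1 for (i, j) in s_off if (j, i) in S)

    # nsym ignores Zeros and compares the numerical values of A.
    a_off = [(i, j) for (i, j), v in A.items() if i != j and v != 0]
    matched = sum(1 for (i, j) in a_off if A.get((j, i), 0) == A[(i, j)])

    # Assumption: with no off-diagonal entries, report 1.0 (symmetric).
    psym = pmatched / len(s_off) if s_off else 1.0
    nsym = matched / len(a_off) if a_off else 1.0
    return psym, nsym


# A 3-by-3 example with four off-diagonal nonzeros, two of them matched.
A = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (0, 2): 3.0, (2, 1): 5.0}
```

With this A, symmetry(A) gives psym = 2/4 and nsym = 2/4. Adding an explicit zero at position (2, 0) raises psym to 4/5, because entry (0, 2) now has a partner in the pattern, while nsym is unchanged.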
Download panel:
This panel controls the downloading of matrices, with
three check boxes, two buttons, and informational items:
- Download: click this to begin the download of the
selected matrices. Matrices that are already downloaded are skipped
(thus, if one of your matrix files happens to get corrupted, simply
remove the file and restart the UFgui). Matrices are downloaded into
the mat, MM, and RB folders inside the UFget
folder, in the same order as they appear in the table view. For
example, if you click on the # nonzeros column heading prior to
clicking Download, your selected matrices will be downloaded in
order of increasing number of nonzero entries. If you hover your mouse
over the download button, a tool tip will tell you what it will do.
- MATLAB (mat) format: click this to download the selected
matrices in MATLAB *.mat format.
- Matrix Market (MM) format: click this to download the selected
matrices in Matrix Market format.
- Rutherford/Boeing (RB) format: click this to download the
selected matrices in Rutherford/Boeing format.
- Matrices selected: gives a running total of the number of
matrices selected in the table.
- Cancel: click to cancel the current download, deleting the
matrix currently being downloaded. Matrices already fully downloaded
are never deleted by the UFgui program.
- Overall progress: this progress bar advances after each
matrix is downloaded.
- Current file: this progress bar advances as the current
matrix is downloaded. The icon and name of the matrix currently
being downloaded is shown in the icon panel.
The contents of the mat, matrices, MM, and RB
folders ("directories" for Unix users), and the UFstats.txt file itself,
may be deleted at will when the program is not running. They will be
recreated when the program restarts. If you delete the matrices
directory, for example, matrix icons are redownloaded, which makes for a
fun slide show. Sit back and watch.
6. Customization
The UFsettings.txt file contains six lines of text, each with a parameter
that you can modify. If this file is missing, or has fewer than six
lines, then defaults are used.
- The first line is the default folder containing the mat,
matrices, MM, and RB directories. It is displayed
on the GUI just above the table of matrices. The line is blank
by default. If left blank, these four folders are to be found in the
current working directory. You can modify this first line to refer to
another folder. For example, if I were to create my own folder called
MyMatrices in my home directory, I would use
/home/davis/MyMatrices/ in Unix/Linux,
/Users/davis/MyMatrices/ in Mac OS X, or C:\Documents and
Settings\davis\My Documents\MyMatrices\ in Windows. The trailing
file-separator is optional. Both the slash (/) and backslash (\)
characters are interpreted as your system's file-separator ('\' for
Windows, '/' for everything else).
- The second line is the root URL of the UF Sparse Matrix Collection,
http://www.cise.ufl.edu/research/sparse. This only needs to change in the
unlikely event that the entire collection moves to a new URL.
- The third line is the number of days after which a new list of matrices (in
matrices/UFstats.csv and mat/UF_Index.mat) should be
downloaded. By default this is set to 30. A value of inf means
that this UFstats.csv file is never downloaded automatically. If
the matrices/UFstats.csv is missing or corrupted when UFgui starts,
it will download fresh copies of both files. Thus, to force a refresh,
simply delete the matrices/UFstats.csv file and then start UFgui.
You may also download the files with the MATLAB command
UFget('refresh') prior to running UFgui. The UFsettings.txt file
is also used by the UFget MATLAB interface.
- The fourth line gives the name of your HTTP proxy server, if you connect to
the internet via a proxy. If left blank, no proxy is used.
- The fifth line is the port number for your HTTP proxy. If blank,
the default of 80 is used.
- The sixth and final line controls debug diagnostics. If this line is
"debug" (without the quotes), then diagnostics are printed on the standard
output (System.out). You should also start the UFgui.jar via the
command line, otherwise the diagnostic output is not visible.
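The six-line layout described above is simple enough to read by hand. The Python sketch below shows one way to do so; it is not how UFgui itself reads the file. The function name parse_settings and the dictionary keys are hypothetical, and falling back to the default root URL when the second line is blank is an assumption.

```python
import math

# Defaults as described in the text: blank folder, the collection's
# root URL, 30 days, no proxy, port 80, and no debug output.
DEFAULTS = {
    "folder": "",
    "url": "http://www.cise.ufl.edu/research/sparse",
    "refresh_days": 30.0,
    "proxy_host": "",
    "proxy_port": 80,
    "debug": False,
}


def parse_settings(text):
    """Parse the contents of UFsettings.txt into a dictionary."""
    lines = text.splitlines()
    if len(lines) < 6:
        # Fewer than six lines: use the defaults for everything.
        return dict(DEFAULTS)
    folder, url, days, host, port, debug = (s.strip() for s in lines[:6])
    return {
        "folder": folder,
        # Assumption: a blank URL line falls back to the default.
        "url": url or DEFAULTS["url"],
        # float("inf") is infinity, meaning "never refresh automatically".
        "refresh_days": float(days) if days else DEFAULTS["refresh_days"],
        "proxy_host": host,
        "proxy_port": int(port) if port else DEFAULTS["proxy_port"],
        "debug": debug == "debug",
    }
```

A file whose third line is inf yields refresh_days equal to math.inf, and a file with fewer than six lines yields a copy of DEFAULTS.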
7. Limitations and known issues
- When a download is complete, the table order jitters slightly. This is
because the mat, MM, and RB columns are updated. The
table sort order is temporarily modified, even if you have not currently
selected one of those columns to sort. It then returns to the proper order
immediately. This appears to be a limitation of the Java Swing library in
JDK 6.
- When cancelling a download with the Cancel button, or terminating the
program normally (by clicking the [x] icon to close the UFgui window), any
partial file currently being downloaded is safely deleted. These files
have the term _IN_PROGRESS appended to their name. If the UFgui program
terminates abnormally and suddenly in the middle of a download, however, it
will leave behind files of this form. You can safely delete any
*_IN_PROGRESS file when the UFgui program is not running.
- Working through an HTTP proxy can be sluggish, and I have even seen
downloads stall completely. I am currently investigating why this
occurs and how to work around a sluggish proxy. If you are using a
proxy and the download stalls, try clicking Cancel and then
Download. If that fails, terminate UFgui. If you terminate it
abnormally (not by clicking the [x] in the window, but with "kill -9"
in Linux, or by forcing it to quit via your OS), you may need to
delete the *_IN_PROGRESS file. Please let me know if this happens.
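If UFgui was terminated abnormally, leftover files can be located with a short script instead of by hand. The Python sketch below searches a folder tree for files whose names end in _IN_PROGRESS, as described above. The function names are hypothetical, and the script should only be run when UFgui is not running.

```python
from pathlib import Path


def find_leftovers(root):
    """List files under root whose names end in _IN_PROGRESS."""
    return sorted(p for p in Path(root).rglob("*_IN_PROGRESS") if p.is_file())


def remove_leftovers(root):
    """Delete the leftover files and return the paths that were removed."""
    removed = find_leftovers(root)
    for p in removed:
        p.unlink()
    return removed
```

Calling find_leftovers on the folder that contains your mat, MM, and RB folders lists the partial downloads without deleting anything, so you can review them before calling remove_leftovers.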
8. Copyright and License
Copyright (c) 2009-2010, Timothy A. Davis. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in
the documentation and/or other materials provided with the distribution.
- Neither the name of the University of Florida nor the names
of its contributors may be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
9. Version and Change-Log
- Version 1.0.4, October 27, 2010. Update to this help file only.
- Version 1.0.3, October 9, 2009. Removed the Update button which when
clicked would download a fresh copy of matrices/UFstats.csv. UFgui
updates itself automatically every 30 days. If you want to force a
refresh, just delete that file and restart UFgui. Changed the appearance
of the mat, MM, and RB columns (from Iain Duff's comments).
- Version 1.0.2, October 8, 2009. Added HTTP proxy option
(suggested by Iain Duff), and debug diagnostic option.
- Version 1.0.1, October 7, 2009. Added "Clear selections" button
(suggested by Xiangrui Meng).
- UFgui Version 1.0, October 7, 2009, released.
- When a new version of this software is available, simply move the
mat, matrices, MM, and RB folders from the old
UFget folder into the new one, to preserve the matrices you have
already downloaded. Alternatively, you can edit the UFsettings.txt
file to reflect the folder containing your previous mat,
matrices, MM, and RB folders (see the
Customization section, above).
10. For Further Help
Contact the author of this UFgui program, and the maintainer of the UF Sparse
Matrix Collection: Tim Davis (http://www.cise.ufl.edu/~davis, email
davis@cise.ufl.edu or DrTimothyAldenDavis@gmail.com).
To print this document,
open the file UFhelp.html in your favorite web browser.
11. Acknowledgements
I would like to thank Iain Duff and Xiangrui Meng for their feedback,
which has improved this package. Designing a GUI is an art, and getting
usability feedback from others is vital.