PubFig
Classification
Face
License: Custom

Overview

The PubFig dataset is divided into 2 parts:

  1. The Development Set contains images of 60 individuals. Use this set when developing your
    algorithm, so as to avoid overfitting on the evaluation set. There is NO overlap between this
    set and the evaluation set, nor between this set and the people in the LFW dataset.
  2. The Evaluation Set contains images of the remaining 140 individuals. This is the set on
    which you can evaluate your algorithm to see how it performs.

Due to copyright issues, we cannot distribute image files in any format to anyone. Instead,
we have made available a list of image URLs from which you can download the images yourself.
We realize that this makes it impossible to compare numbers exactly, as image links will slowly
disappear over time, but we have no other option. This seems to be the way other large web-based
databases are evolving. We hope to periodically update the dataset, removing broken links and
adding new ones, allowing for close-to-exact comparisons.
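
Because the images are fetched from third-party URLs, it can help to check each downloaded file against the md5sum field listed in the URL files. The following is a minimal sketch, assuming that md5sum is the lowercase hexadecimal MD5 digest of the image file's bytes (the page does not state this explicitly); the function name `md5_matches` is hypothetical.

```python
import hashlib


def md5_matches(data: bytes, expected_md5: str) -> bool:
    """Return True if the MD5 hex digest of `data` equals `expected_md5`.

    Assumption: the dataset's md5sum field is the hex MD5 digest of the
    raw image bytes. Whitespace and letter case are normalized first.
    """
    actual = hashlib.md5(data).hexdigest()
    return actual == expected_md5.strip().lower()
```

The bytes could come from any download method, for example `urllib.request.urlopen(url).read()`. A mismatch may mean the image behind the link has changed since the list was compiled.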

Data Format

Almost all datafiles follow a "tab-separated values" format. The first two lines are generally
like this:

# PubFig Dataset v1.2 - filename.txt - http://www.cs.columbia.edu/CAVE/databases/pubfig/
#    person    imagenum    url    rect    md5sum

The first line identifies the name and version of the dataset, the filename, and
has a link back to this website. The second line defines the fields in the file, separated
by tabs ('\t'). In this example (similar to the dev_urls.txt and eval_urls.txt files), there
are 5 fields: person, imagenum, url, rect, and md5sum. The first two are common to many of
the datafiles and are the name of the person and an image index number used to refer to a
specific image of that individual. Note that image numbers are not necessarily sequential
for each person -- there are "holes" in the counting.

Subsequent lines contain one entry per line, with field values also separated by tabs.
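
The layout described above can be read with a few lines of Python. This is a minimal sketch, not an official loader: the function name `parse_pubfig_tsv` is hypothetical, every value is kept as a string (the format of rect is not documented here, so it is left uninterpreted), and it assumes the file has exactly the two '#' header lines shown above.

```python
def parse_pubfig_tsv(text):
    """Parse the text of a PubFig-style datafile.

    Returns (banner, fields, rows), where rows is a list of dicts
    keyed by the field names taken from the second header line.
    """
    lines = text.splitlines()
    # Line 1: dataset name and version, filename, and link.
    banner = lines[0].lstrip("#").strip()
    # Line 2: field names after the leading '#'. Field names contain no
    # spaces, so splitting on any whitespace is safe here.
    fields = lines[1].lstrip("#").split()
    rows = []
    for line in lines[2:]:
        if not line.strip():
            continue  # skip blank lines
        # Data values are split on tabs only, since names contain spaces.
        values = line.split("\t")
        rows.append(dict(zip(fields, values)))
    return banner, fields, rows
```

Since image numbers have "holes", it is safer to look images up by the (person, imagenum) pair than to assume imagenum runs from 1 to the number of images for each person.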

Citation

The database is made available only for non-commercial use. If you use this dataset, please
cite the following paper:

@InProceedings{CAVE_0296,
author = {N. Kumar and A. C. Berg and P. N. Belhumeur and S. K. Nayar},
title = {{A}ttribute and {S}imile {C}lassifiers for {F}ace {V}erification},
booktitle = {IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2009}
}

License

Custom

Data Summary

Type: Image
Amount: 58.797K
Size: 31.76MB
Provided by: Columbia University

Columbia University (also known as Columbia, and officially as Columbia University in the City of New York) is a private Ivy League research university in New York City. Established in 1754 on the grounds of Trinity Church in Manhattan, Columbia is the oldest institution of higher education in New York and the fifth-oldest institution of higher learning in the United States. It is one of nine colonial colleges founded prior to the Declaration of Independence, seven of which belong to the Ivy League. Columbia has been ranked by numerous major education publications as among the top ten universities in the world.
