Although 454-based pyrosequencing has led to great advances, an intrinsic artifact of the process leads to the artificial over-representation of more than 10% of the original DNA sequencing templates. This is particularly problematic in metagenomic studies, where the abundance of a sequence in a dataset is often used for comparative community analysis: the artificial replicates can skew interpretation when datasets are compared, so it is important to remove them before analysis. As metagenome datasets become more plentiful, the ability to apply more robust statistical tests becomes increasingly important, and the validity of the input datasets becomes more crucial.

Tools such as MG-RAST (covered in the January issue of Cold Spring Harbor Protocols in "Using the Metagenomics RAST Server (MG-RAST) for Analyzing Shotgun Metagenomes") can remove exact duplicates, but this captures only a subset of the artificial replicates. In the April issue of Cold Spring Harbor Protocols, Tracy Teal and Thomas Schmidt from Michigan State University present an instruction set for "Identifying and Removing Artificial Replicates from 454 Pyrosequencing Data". Their 454 Replicate Filter is a web-based tool that incorporates the cd-hit algorithm. The protocol details how to use the replicate filter to obtain a file of unique sequences for use in metagenomic or transcriptomic analyses, giving users a more accurate quantitative representation of the sequence diversity in a dataset.
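
To make the distinction between exact duplicates and the broader set of artificial replicates concrete, here is a minimal Python sketch. It is an illustration only: it is not the 454 Replicate Filter's implementation and it does not call cd-hit. The function names (`remove_exact_duplicates`, `remove_near_replicates`), the parameters (`prefix_len`, `min_identity`), and the threshold values are all hypothetical, and the heuristic assumes that a replicate starts with the same bases as its template and is nearly identical over the shared length.

```python
def remove_exact_duplicates(reads):
    """Keep the first occurrence of each identical sequence."""
    return list(dict.fromkeys(reads))


def mismatches(a, b):
    """Count mismatched positions over the shared length of two reads."""
    return sum(1 for x, y in zip(a, b) if x != y)


def remove_near_replicates(reads, prefix_len=10, min_identity=0.9):
    """Greedy near-duplicate filter (hypothetical heuristic, not cd-hit).

    A read is treated as a replicate of an already kept read when both
    begin with the same `prefix_len` bases and their identity over the
    shared length is at least `min_identity`.
    """
    reads = [r for r in reads if r]  # ignore empty reads
    representatives = []
    # Visit longer reads first so the longest read represents each group.
    for read in sorted(reads, key=len, reverse=True):
        is_replicate = False
        for rep in representatives:
            if read[:prefix_len] != rep[:prefix_len]:
                continue
            overlap = min(len(read), len(rep))
            identity = 1 - mismatches(read, rep) / overlap
            if identity >= min_identity:
                is_replicate = True
                break
        if not is_replicate:
            representatives.append(read)
    return representatives


# Toy reads, invented for illustration.
reads = [
    "AAAACCCCGGGGTTTT",  # template
    "AAAACCCCGGGGTTTT",  # exact duplicate
    "AAAACCCCGGGG",      # shorter copy with the same start
    "AAAACCCCGGGGTTTA",  # same start, one mismatch
    "GGGGTTTTAAAACCCC",  # unrelated read
]
print(len(remove_exact_duplicates(reads)))  # 4
print(len(remove_near_replicates(reads)))   # 2
```

On this toy input, exact-duplicate removal leaves four reads, while the near-replicate heuristic leaves two. That mirrors the point above: exact matching alone catches only part of the artificial replication.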