In biomedical text mining, recognizing and normalizing cell line names is of great value, in particular for identifying synthetically lethal genes from the literature. Many tools have been developed to identify cell lines in text, but it is unclear whether they can help search the cancer literature for synthetic lethal genes. In their study, Kaewphan et al. (2016) examined in depth the methods that can help in cell line recognition. They identified a tagger that is not tied to a specific subdomain, and they used two text collections for cell line recognition: the Gellus and CLL corpora.
They report F-scores of 88.46% on Gellus and 85.98% on CLL. For more results from their text mining work, visit http://turkunlp.github.io/Cell-line-recognition/
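For readers unfamiliar with the metric, an F-score combines precision and recall into a single number. The snippet below is a minimal sketch, assuming the reported figure is the balanced F-score (F1) computed from entity-level counts. The function name `f_score` and the counts are hypothetical illustrations and are not taken from the paper.

```python
def f_score(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Return the F-beta score from true positive, false positive,
    and false negative counts (beta=1.0 gives the balanced F1)."""
    if tp == 0:
        # No correct predictions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)  # share of predicted mentions that are correct
    recall = tp / (tp + fn)     # share of true mentions that were found
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for illustration only (not from the paper).
score = f_score(tp=90, fp=10, fn=20)
print(f"F-score: {score:.2%}")
```

With these made-up counts, precision is 0.90 and recall is about 0.82, so the F-score falls between the two and is pulled toward the lower value.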
Reference:
Kaewphan S et al. (2016) Cell line recognition in support of the identification of synthetic lethality in cancer from text. Bioinformatics 32(2):276-282.