Approximation-based feature selection and application for algae population estimation

Shen, Qiang; Jensen, Richard

doi:10.1007/s10489-007-0058-y

Published: 06 June 2007

Volume 28, pages 167–181, (2008)

Applied Intelligence

Abstract

This paper presents a data-driven approach to feature selection that addresses the common problem of dealing with high-dimensional data. Unlike many existing approaches, it can handle real-valued domain features, which it accomplishes through the use of fuzzy-rough approximations. The paper demonstrates the effectiveness of the approach by proposing an estimator of algae populations: a system that approximates the size of algae populations given certain water characteristics. The estimator significantly reduces computer time and space requirements, decreases the cost of obtaining measurements and increases runtime efficiency, making it more economically viable. By retaining only the information required for the estimation task, the system offers higher accuracy than conventional estimators. Finally, the system does not alter the domain semantics, so any distilled knowledge remains human-readable. The paper describes the problem domain and the architecture and operation of the system, and presents and discusses detailed experiments. The results show that, in general, algae estimators using a fuzzy-rough feature selection step produce more accurate predictions of algae populations.
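
The abstract does not spell out the selection procedure, so the following is only an illustrative sketch of how a fuzzy-rough feature selection step over real-valued features might look. It assumes a similarity-relation formulation (a linear similarity per feature, the minimum t-norm to combine features, and the Łukasiewicz implicator for the lower approximation) together with a greedy forward search on the resulting dependency degree. These choices, and the names `similarity`, `dependency` and `greedy_select`, are assumptions made for illustration and may differ from the formulation used in the paper.

```python
def similarity(values):
    """Pairwise fuzzy similarity for one real-valued attribute.

    Assumed form: 1 - |x - y| / range, clipped at 0.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant attributes
    n = len(values)
    return [[max(0.0, 1.0 - abs(values[i] - values[j]) / span)
             for j in range(n)] for i in range(n)]


def dependency(feature_sims, decision_sim, subset):
    """Fuzzy-rough dependency degree of the decision on a feature subset.

    For each object, the lower-approximation membership is the infimum over
    all objects of the Lukasiewicz implication from feature similarity to
    decision similarity; the dependency is the mean of these memberships.
    """
    n = len(decision_sim)
    total = 0.0
    for i in range(n):
        lower = 1.0
        for j in range(n):
            # Combine per-feature similarities with the minimum t-norm.
            r_p = min(feature_sims[a][i][j] for a in subset)
            implication = min(1.0, 1.0 - r_p + decision_sim[i][j])
            lower = min(lower, implication)
        total += lower
    return total / n


def greedy_select(rows, target):
    """Greedy forward selection: add the feature that raises dependency most."""
    n_features = len(rows[0])
    sims = [similarity([row[a] for row in rows]) for a in range(n_features)]
    d_sim = similarity(target)
    selected, best = [], 0.0
    while True:
        candidates = [(dependency(sims, d_sim, selected + [a]), a)
                      for a in range(n_features) if a not in selected]
        if not candidates:
            break
        # Highest dependency wins; ties go to the lowest feature index.
        score, feature = max(candidates, key=lambda t: (t[0], -t[1]))
        if score <= best + 1e-12:  # stop when no candidate improves
            break
        selected.append(feature)
        best = score
    return selected, best


# Toy data (hypothetical): column 0 tracks the target, column 1 is noise.
rows = [[0.0, 0.3], [1.0, 0.9], [2.0, 0.1], [3.0, 0.7]]
target = [0.0, 1.0, 2.0, 3.0]
selected, gamma = greedy_select(rows, target)
print(selected, gamma)
```

Because the selected features are a subset of the original measurements rather than a transformed combination of them, a search of this kind leaves the meaning of each retained feature unchanged, which is the property the abstract refers to as not altering the domain semantics.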

Author information

Authors and Affiliations

Department of Computer Science, The University of Wales, Aberystwyth, UK
Qiang Shen & Richard Jensen

Corresponding author

Correspondence to Qiang Shen.

About this article

Cite this article

Shen, Q., Jensen, R. Approximation-based feature selection and application for algae population estimation. Appl Intell 28, 167–181 (2008). https://doi.org/10.1007/s10489-007-0058-y

Received: 07 February 2007
Accepted: 16 April 2007
Published: 06 June 2007
Issue Date: April 2008
DOI: https://doi.org/10.1007/s10489-007-0058-y
