A New Decision Tree Induction Using Composite Splitting Criterion
|Published in:||Issue 3, (Vol. 4) / 2010|
|Author(s):||MAHMOOD Ali Mirza, KUPPA Mrithyumjaya Rao, REDDI Kiran Kumar|
|Abstract.||C4.5 is the most widely used decision tree algorithm to date, and its most popular heuristic function is gain ratio. This heuristic has a serious disadvantage: it deals poorly with data sources that contain irrelevant features. Hill climbing is a machine learning search technique with a good search mechanism. Given the relationship between hill climbing and greedy search, it can serve as the heuristic function of a decision tree in order to overcome the disadvantage of gain ratio. This paper proposes a composite splitting criterion that combines a greedy hill-climbing approach with gain ratio. The experimental results show that the proposed heuristic function can scale up accuracy, especially when processing high-dimensional datasets.|
|Keywords:||Decision Trees, Gain Ratio, Composite Splitting Criterion, Hill Climbing.|
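For readers unfamiliar with the gain ratio heuristic named in the abstract, the following is a minimal sketch of the standard C4.5-style gain ratio for a single categorical attribute. It is not the composite criterion proposed in the paper, whose exact form the abstract does not give. The function names `entropy` and `gain_ratio` and the toy data are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute with respect to the class labels.

    Standard definition: information gain divided by split information.
    This is an illustrative sketch, not the paper's composite criterion.
    """
    total = len(labels)
    # Group the class labels by attribute value.
    partitions = {}
    for v, y in zip(values, labels):
        partitions.setdefault(v, []).append(y)
    # Expected entropy of the labels after splitting on the attribute.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    # Split information is the entropy of the attribute's own value distribution.
    split_info = entropy(values)
    # A constant attribute has zero split information; treat it as useless.
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: one attribute that determines the class, one that does not.
labels = ["yes", "yes", "no", "no"]
relevant = ["a", "a", "b", "b"]
irrelevant = ["x", "y", "x", "y"]

print(gain_ratio(relevant, labels))
print(gain_ratio(irrelevant, labels))
```

On this toy data the relevant attribute scores higher than the irrelevant one. A tree learner would evaluate every candidate attribute this way at each node and split on the one with the highest score.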
This article is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.