From: Raymond Mooney
Date: Mon, 12 Mar 90 19:19:58 CST

Regarding the data we used: we used a later version of the soybean data sent to us by Bob Stepp, which I believe to be the version used by Reinke in his MS thesis on the GEM system (this has 17 diseases, as referenced by Spackman, ML88).

I also got the version you have (19 diseases, 35 features) from Bob Stepp, which I believe is the version used in the original experiments from Michalski & Chilausky (this has missing features, which is why we decided to use the 17-disease version in our experiments). The M&C paper references only 15 diseases (35 features), and the first 15 diseases in my 19-disease set match those in the M&C paper. I guess the last 4 diseases just weren't reported in the paper (wonder why??).

The existence of the 4-disease clustering set adds even more confusion to any reference to using soybean data.

Hope this helps some.

-Ray