KDD Challenge 2000

Guide to the Bacteriological Examination Data Set

 

 


Background

Detection of bacteria is important not only for the treatment of infectious diseases, such as pneumonia, but also for the prevention of hospital-acquired infection.

This database was extracted from the hospital information system of a municipal hospital. It includes information about the clinical environment, the names of detected bacteria, and the characteristics of the detected bacteria.

For more information about bacteriology, please refer to:

For more information on antibiotics, please refer to:

For abbreviation symbols of antibiotics, please refer to: antibiotics index:

For Analysis

(1) Please find patterns that determine whether bacteria are detected.
As the target concept, please use the attribute "Name of Detected Bacteria".
If this attribute has a value, bacteria were detected (positive).
If the value is missing, bacteria were not detected (negative).
(2) Please find patterns showing relationships between the Name of Detected
Bacteria and other attributes. The number of distinct values in this attribute
may be too large for analysis, so grouping is needed. For further preprocessing
steps, please contact the donor: tsumoto@computer.org
(3) Please find patterns showing relationships between resistance to antibiotics
and other attributes.
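
As an illustration of the target concept in task (1), the following minimal sketch derives a positive/negative label from attribute 44. It assumes that each record is a list of 162 strings in the attribute order given in this guide and that a missing value appears as an empty or blank string; neither assumption is confirmed by the guide. The names `is_positive` and `label_counts` are hypothetical.

```python
from collections import Counter

# Assumption: each record is a list of 162 strings in the attribute order
# of this guide, and a missing value is an empty or blank string.
BACTERIA_NAME_ATTR = 44  # 1-based attribute number from this guide


def is_positive(record):
    """Return True if a bacteria name is present (detected), else False."""
    value = record[BACTERIA_NAME_ATTR - 1]  # convert to a 0-based index
    return value is not None and value.strip() != ""


def label_counts(records):
    """Count how many records are positive and how many are negative."""
    return Counter(
        "positive" if is_positive(record) else "negative" for record in records
    )
```

The class balance returned by `label_counts` is worth checking before any pattern search, since a strongly skewed target affects most learning methods.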

Any other knowledge extraction is also welcome.


Information about Attributes

Sample Size:

20920 records for 1994 Data
22710 records for 1995 Data

Attributes: 162
Categorical Attributes: 1..45,47..66
Numerical Attributes: 46, 67..162
(67-162 are mixtures of inequalities and values,
such as ">16" and "8".)
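
Because attributes 67-162 mix inequalities and plain values, they usually need to be split into a comparator and a number before numerical analysis. The sketch below shows one possible way to do this, together with the attribute ranges listed above. The comparator forms other than ">" (such as "<=") are an assumption, as the guide only gives ">16" and "8" as examples; the names `parse_mic`, `CATEGORICAL_ATTRS` and `NUMERICAL_ATTRS` are hypothetical.

```python
import re

# 1-based attribute numbers, as listed in this guide.
CATEGORICAL_ATTRS = set(range(1, 46)) | set(range(47, 67))
NUMERICAL_ATTRS = {46} | set(range(67, 163))

# Assumption: values look like "8", "0.5", ">16" or "<=2". Any other form
# is not described in this guide and is treated as unrecognized.
_MIC_PATTERN = re.compile(r"^\s*(<=|>=|<|>)?\s*(\d+(?:\.\d+)?)\s*$")


def parse_mic(raw):
    """Split a raw value into (comparator, number).

    The comparator is "=" when the field holds a plain number.
    Returns None for missing or unrecognized input.
    """
    if raw is None:
        return None
    match = _MIC_PATTERN.match(raw)
    if match is None:
        return None
    comparator = match.group(1) or "="
    return comparator, float(match.group(2))
```

Keeping the comparator separate from the number avoids silently treating a censored value such as ">16" as the exact value 16.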

---------------------------------------------------------------------------------------

(1) Personal Information

1. Sample Number
2. Received Date: date when the laboratory started processing the sample.
3. ID Number
4. Gender
5. Birthday
6. Department
7. Ward No.
8. Doctor Name

(2) Information for Samples

9. Examination Date: date when the sample was taken from the patient
10. Sample Location

11. Objective1: the objective of the bacterial test
12. Objective2: the objective of the bacterial test

(3) Status of Patient

13. Fever

(14-18 show the locations in the patient where a catheter is inserted.
If no values are observed, the patient does not have any catheter.)
14. Catheter1
15. Catheter2
16. Catheter3
17. Catheter4
18. Catheter5

19. Tracheo
20. Intubation

(21-25 show the locations in the patient where a drain is inserted.
If no values are observed, the patient does not have any drain.)
21. Drainage1
22. Drainage2
23. Drainage3
24. Drainage4
25. Drainage5

26. WBC: White Blood Cell Count

(4) Treatment to the Patient and Diagnosis

(27-29 show types of treatment other than 30-33.
If no values are observed, the patient receives no treatment
other than 30-33.)

27. Medication1
28. Medication2
29. Medication3
30. Steroid
31. Anticancer Drugs
32. Radiation
33. Antiphlogistics

(34-36 show the chronic diseases from which the patient suffers.
If no values are observed, the patient does not suffer from any chronic
disease.)
34. Diagnosis1
35. Diagnosis2
36. Diagnosis3

37. Engineer Name
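
The "no values observed" convention used for attribute groups 14-18, 21-25, 27-29 and 34-36 can be turned into simple presence flags. The following is a minimal sketch, assuming that a record is a list of 162 strings in the attribute order of this guide and that an unobserved value is an empty or blank string. The group labels and function names are hypothetical.

```python
# 1-based inclusive attribute ranges taken from this guide.
# The dictionary keys are illustrative labels, not names from the data.
ATTRIBUTE_GROUPS = {
    "catheter": (14, 18),
    "drainage": (21, 25),
    "other_treatment": (27, 29),
    "chronic_disease": (34, 36),
}


def observed_values(record, group):
    """Return the non-blank values recorded for one attribute group."""
    first, last = ATTRIBUTE_GROUPS[group]
    fields = record[first - 1:last]  # 1-based inclusive -> 0-based slice
    return [f.strip() for f in fields if f is not None and f.strip()]


def group_flags(record):
    """Map each group to True if at least one value is observed."""
    return {
        group: bool(observed_values(record, group))
        for group in ATTRIBUTE_GROUPS
    }
```

Collapsing each group into one flag, or into the list returned by `observed_values`, avoids treating the position of a value (for example Catheter1 versus Catheter2) as meaningful, which the guide does not claim.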

(5) Laboratory Examinations

38. Urea-WBC
39. Urea-Nitrocide
40. Urea-Protein(Qualitative)
41. Urea-Occultblood
42. Urea-Protein(Quantitative)
43. Total Amount of Bacteria
44. Name of Detected Bacteria
45. biocode
46. Percentage of biocode
47. beta-lactamese
48. VITEK_Card

(6) Sensitivity to Antibiotics


For abbreviation symbols of antibiotics, please refer to: antibiotics index:

49. PCG
50. PCs
51. Aug
52. PCs-Green
53. CEPs-1
54. CEPs-2
55. CEPs-3
56. CEPs-4
57. CEPs-Green
58. AGs
59. MLs
60. TCs
61. LCMs
62. CPs
63. CBPs
64. VCM
65. RFP/FOM

(7) MIC (minimum inhibitory concentration)

66. MIC
67. MPIPC/AMPH
68. ABPC/MCZ
69. AMPC/5FC
70. ASPC/FCZ
71. AMPC/CVA/ITZ
72. ABPC/SBT
73. SBPC
74. CBPC
75. TIPC
76. PIPC
77. CEZ
78. CTM
79. CMD
80. CMZ
81. CFX
82. RFP
83. CMX
84. CZX
85. CTX
86. LMOX
87. FMOX
88. CPR
89. CBPZ
90. CTRX
91. CPZ
92. CPZ/SBT
93. CFPM
94. CFS
95. CAZ
96. AZT
97. CRMN
98. IPM
99. ISP
100. AMK
101. TOB
102. ABK
103. NTL
104. SISO
105. MCR
106. ASTM
107. MINO
108. DOXY
109. CLDM
110. CP
111. FOM
112. VCM
113. LVFX
114. MEPM
115. mpipc/amph
116. abpc/mcz
117. ampc/5fc
118. aspc/fcz
119. ampc/cva/itz
120. abpc/sbt
121. sbpc
122. cbpc
123. tipc
124. pipc
125. cez
126. ctm
127. cmd
128. cmz
129. cfx
130. rfp
131. cmx
132. czx
133. ctx
134. lmox
135. fmox
136. cpr
137. cbpz
138. ctrx
139. CPZ/SBT
140. cpz/sbt
141. cfpm
142. cfs
143. caz
144. azt
145. crmn
146. ipm
147. isp
148. amk
149. tob
150. abk
151. ntl
152. siso
153. mcr
154. astm
155. mino
156. doxy
157. cldm
158. cp
159. fom
160. vcm
161. lvfx
162. mepm


Database

The database consists of a single table.

bact94.csv
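
A minimal sketch for reading the table and checking it against the attribute count stated in this guide is shown below. It assumes a comma-separated file; whether the file has a header row and which character encoding it uses are not stated in this guide, so both are left to the caller. The function name `read_records` and the `has_header` parameter are hypothetical.

```python
import csv

EXPECTED_ATTRIBUTES = 162  # attribute count stated in this guide


def read_records(lines, has_header=False):
    """Yield each row as a list of 162 strings.

    `lines` is any iterable of text lines, such as an open file.
    Raises ValueError if a row does not have exactly 162 fields.
    """
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # skip the header row (assumption: at most one)
    first_row = 2 if has_header else 1
    for row_number, row in enumerate(reader, start=first_row):
        if len(row) != EXPECTED_ATTRIBUTES:
            raise ValueError(
                f"row {row_number}: expected {EXPECTED_ATTRIBUTES} fields, "
                f"found {len(row)}"
            )
        yield row


# Example usage (the encoding of the file is an open question):
# with open("bact94.csv", newline="") as handle:
#     records = list(read_records(handle))
```

Failing early on a wrong field count makes it easier to notice quoting or delimiter problems before the attribute numbers in this guide are used as column positions.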

 

This database was donated by Dr. Shusaku Tsumoto (Department of Medical Informatics, Shimane Medical University). E-mail: tsumoto@computer.org
For questions about the data or the task description, please contact Dr. Tsumoto. All questions put to Dr. Tsumoto, together with his answers, will be published as appendices to this document.


Asked Questions


Last modified: Fri Feb 4 09:52:52 JST 2000