 

Bayesian Analysis for Censored Data

     
  Stochastic models are used to express pathogen density in environmental samples when performing microbial risk assessment with quantitative uncertainty. However, enteric virus density in water often falls below the quantification limit of the analytical method employed (a non-detect), and stochastic models are difficult to fit to a dataset with a large proportion of non-detects, i.e., left-censored data. We applied a Bayesian model that handles both detected data (detects) and non-detects to simulated left-censored datasets of enteric virus density in wastewater.

  One hundred paired datasets were generated for each of 39 combinations of sample size and number of detects: three sample sizes (12, 24, and 48) were used, with the number of detects ranging from 1 to 12, plus 24 and 48 where the sample size allows. The simulated observations were split into detects and non-detects by setting the limit of quantification so that each dataset contained the assumed number of detects. The Bayesian model was then applied to the censored datasets, and the estimated mean and standard deviation were compared with the true values by root mean square deviation. The difference between the true distribution and the posterior predictive distribution was evaluated by Kullback–Leibler (KL) divergence, and estimation accuracy was found to be strongly affected by the number of detects. It is difficult to state universal criteria for what level of accuracy is sufficient, but eight or more detects are required to estimate the posterior predictive distribution accurately when the sample size is 12, 24, or 48.

  The posterior predictive distribution of virus removal efficiency in a wastewater treatment unit process was obtained as the log ratio between the posterior predictive distributions of enteric viruses in untreated and treated wastewater. The KL divergence between the true distribution and the posterior predictive distribution of virus removal efficiency also depends on the number of detects, and eight or more detects in a dataset of treated wastewater are required for its accurate estimation.
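The simulation protocol described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: it assumes log10 virus densities are normally distributed, approximates both compared distributions as normal so that the KL divergence has a closed form, and uses hypothetical function names (`simulate_censored`, `kl_normal`, `rmsd`, `log_removal_draws`) and parameter values that do not come from the papers.

```python
import math
import random


def simulate_censored(n, n_detects, mu, sigma, rng):
    """Draw n log10 densities and pick a limit leaving exactly n_detects above it."""
    values = sorted((rng.gauss(mu, sigma) for _ in range(n)), reverse=True)
    if n_detects == n:
        limit = values[-1] - 1e-9  # just below the smallest value: nothing censored
    else:
        # midway between the smallest detect and the largest non-detect
        limit = 0.5 * (values[n_detects - 1] + values[n_detects])
    detects = [v for v in values if v > limit]
    return detects, n - len(detects), limit


def kl_normal(mu_p, sd_p, mu_q, sd_q):
    """KL(P || Q) for two univariate normals (assumed normal approximation)."""
    return (math.log(sd_q / sd_p)
            + (sd_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sd_q ** 2)
            - 0.5)


def rmsd(estimates, truth):
    """Root mean square deviation of repeated estimates from the true value."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


def log_removal_draws(mu_in, sd_in, mu_out, sd_out, n_draws, rng):
    """Monte Carlo draws of log10 removal: log10(untreated) - log10(treated)."""
    return [rng.gauss(mu_in, sd_in) - rng.gauss(mu_out, sd_out)
            for _ in range(n_draws)]


# Usage: one dataset with sample size 12 and 5 detects (illustrative values).
rng = random.Random(0)
detects, n_nondetects, limit = simulate_censored(12, 5, mu=3.0, sigma=1.0, rng=rng)
print(len(detects), n_nondetects, round(limit, 3))
```

Placing the limit between two adjacent order statistics is one simple way to obtain a prescribed number of detects; the text above says only that the limit was set to obtain the assumed number of detects, so the midpoint rule here is an assumption.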
 
     
   
 

This study deals with censored data. In the figure, 12 data points are generated, but only five of them (red circles) are above the detection limit. The values of the other seven (blue circles) are not observed and so cannot be used to estimate the underlying true distribution, which is depicted by the red curves. The black curve is the probability distribution estimated by the developed method.
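To make the idea concrete, here is a minimal sketch of how a left-censored likelihood can still use the non-detects: each detect contributes a normal density, and each non-detect contributes the probability of falling below the limit. The data values, the flat prior implied by a plain grid search, and the names `censored_loglik` and `grid_map` are all assumptions made for illustration; the prior and inference procedure of the published method are not described on this page.

```python
import math


def log_pdf(x, mu, sigma):
    """Log density of Normal(mu, sigma) at x."""
    z = (x - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z


def log_cdf(x, mu, sigma):
    """Log of P(X <= x); erfc keeps precision in the lower tail."""
    z = (x - mu) / sigma
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))


def censored_loglik(mu, sigma, detects, n_nondetects, limit):
    """Detects use the density; non-detects use the mass below the limit."""
    return (sum(log_pdf(x, mu, sigma) for x in detects)
            + n_nondetects * log_cdf(limit, mu, sigma))


def grid_map(detects, n_nondetects, limit):
    """Grid search for the mode under a flat prior (an assumed, simple stand-in)."""
    mus = [i * 0.05 for i in range(121)]          # 0.00 .. 6.00
    sigmas = [0.2 + i * 0.05 for i in range(57)]  # 0.20 .. 3.00
    best = max((censored_loglik(m, s, detects, n_nondetects, limit), m, s)
               for m in mus for s in sigmas)
    return best[1], best[2]


# Hypothetical data mirroring the figure: 12 points, 5 above a limit of 3.0.
detects = [3.2, 3.5, 3.9, 4.1, 4.6]
mu_hat, sigma_hat = grid_map(detects, n_nondetects=7, limit=3.0)
naive_mean = sum(detects) / len(detects)
print(mu_hat, sigma_hat, naive_mean)
```

Averaging only the detects overestimates the mean because every detect lies above the limit. The censored likelihood pulls the estimate down to account for the seven unobserved values.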

 
     
  References  
  Tsuyoshi Kato, Takayuki Miura, Satoshi Okabe, and Daisuke Sano, "Bayesian modeling of enteric virus density in wastewater using left-censored data," Food and Environmental Virology, vol. 5, no. 4, pp. 185-193, 2013, doi: 10.1007/s12560-013-9125-1.
  Tsuyoshi Kato, Ayano Kobayashi, Toshihiro Ito, Takayuki Miura, Satoshi Ishii, Satoshi Okabe, and Daisuke Sano, "Estimation of concentration ratio of indicator to pathogen-related gene in environmental water based on left-censored data," Journal of Water and Health, accepted.
  Toshihiro Ito, Tsuyoshi Kato, Kenta Takagishi, Satoshi Okabe, and Daisuke Sano, "Bayesian modeling of virus removal efficiency in wastewater treatment processes," Water Science and Technology, vol. 72, no. 10, pp. 1789-1795, 2015, doi: 10.2166/wst.2015.402.
  Toshihiro Ito, Tsuyoshi Kato, Makoto Hasegawa, Hiroyuki Katayama, Satoshi Ishii, Satoshi Okabe, and Daisuke Sano, "Evaluation of virus reduction efficiency in wastewater treatment unit processes as a credit value in the multiple-barrier system for wastewater reclamation and reuse," Journal of Water and Health, vol. 14, no. 5, pp. 879-889, doi: 10.2166/wh.2016.096.
  Naohiro Kobayashi, Mamoru Oshiki, Toshihiro Ito, Takahiro Segawa, Masashi Hatamoto, Tsuyoshi Kato, Takashi Yamaguchi, Kengo Kubota, Masanobu Takahashi, Akinori Iguchi, Tadashi Tagawa, Tsutomu Okubo, Shigeki Uemura, Hideki Harada, Toshiki Motoyama, Nobuo Araki, and Daisuke Sano, "Removal of human pathogenic viruses in a down-flow hanging sponge (DHS) reactor treating municipal wastewater and health risks associated with utilization of the effluent for agricultural irrigation," Water Research, accepted (IF = 5.991 currently).
  Toshihiro Ito, Masaaki Kitajima, Tsuyoshi Kato, Satoshi Ishii, Takahiro Segawa, Satoshi Okabe, and Daisuke Sano, "Target virus log10 reduction values determined for two reclaimed wastewater irrigation scenarios in Japan based on tolerable annual disease burden," Water Research, accepted (IF = 5.991 currently).
  Miguel Varela, Imen Ouardani, Tsuyoshi Kato, Syunsuke Kadoya, Mahjoub Aouni, Daisuke Sano, and Jesús Romalde, "Sapovirus in wastewater treatment plants in Tunisia: Prevalence, removal, and genetic characterization," Applied and Environmental Microbiology, accepted (IF = 3.807 currently).
     
     
 