Author: Barnathan, Michael
Temple University. Computer and Information Science
Title: Mining complex high-order datasets [electronic resource]
Description: 167 p.
Notes: Source: Dissertation Abstracts International, Volume: 71-07, Section: B, page: 4336
Adviser: Vasileios Megalooikonomou
Thesis (Ph.D.)--Temple University, 2010
Selection of an appropriate structure for storage and analysis of complex datasets is a vital but often overlooked decision in the design of data mining and machine learning experiments. Most present techniques impose a matrix structure on the dataset, with rows representing observations and columns representing features. While this assumption is reasonable when features are scalar and do not exhibit co-dependence, the matrix data model becomes inappropriate when dependencies between non-target features must be modeled in parallel, or when features naturally take the form of higher-order multilinear structures. Such datasets particularly abound in functional medical imaging modalities, such as fMRI, where accurate integration of both spatial and temporal information is critical. Although necessary to take full advantage of the high-order structure of these datasets and built on well-studied mathematical tools, tensor analysis methodologies have only recently entered widespread use in the data mining community and remain relatively absent from the literature within the biomedical domain. Furthermore, naive tensor approaches suffer from fundamental efficiency problems which limit their practical use in large-scale high-order mining and do not capture local neighborhoods necessary for accurate spatiotemporal analysis. To address these issues, a comprehensive framework based on wavelet analysis, tensor decomposition, and the WaveCluster algorithm is proposed for addressing the problems of preprocessing, classification, clustering, compression, feature extraction, and latent concept discovery on large-scale high-order datasets, with a particular emphasis on applications in computer-assisted diagnosis. 
Our framework is evaluated on a 9.3 GB fMRI motor task dataset of both high dimensionality and high order, performing favorably against traditional voxelwise and spectral methods of analysis, discovering latent concepts suggestive of subject handedness, and reducing space and time complexities by up to two orders of magnitude. Novel wavelet and tensor tools are derived in the course of this work, including a novel formulation of an r-dimensional wavelet transform in terms of elementary tensor operations and an enhanced WaveCluster algorithm capable of clustering real-valued as well as binary data. Sparseness-exploiting properties are demonstrated and variations of core algorithms for specialized tasks such as image segmentation are presented.
School code: 0225
Subjects: Health Sciences, Radiology
Computer Science
0574
0984
ISBN/ISSN: 9781124051659
Related link: https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3408691