Catalog Search | National Taiwan Normal University Library


Author	Oruganti, Ram Manohar
	ProQuest Information and Learning Co
	Rochester Institute of Technology. Computer Engineering
Title	Image Description using Deep Neural Networks [electronic resource]
Publication	2016


Description	1 online resource (96 pages)
Notes	Source: Masters Abstracts International, Volume: 55-05
	Adviser: Raymond W. Ptucha
	Thesis (M.S.)--Rochester Institute of Technology, 2016
	Includes bibliographical references
	Current research in computer vision and machine learning has demonstrated strong abilities at detecting and recognizing objects in natural images. State-of-the-art results in object detection, classification and localization in the ImageNet Challenges report a top-5 classification error of 3.08% on the validation set, while similar classification experiments run by trained humans report an error of 5.1%. While some might argue that human accuracy is a function of training time, it can be said with confidence that automated classification models are at least as good as trained humans in classification problems. The ability of these models to analyze and describe complex images, however, is still an active area of research.
	Image description is a good starting point for imparting artificial intelligence to machines by allowing them to analyze and describe complex visual scenes. This thesis introduces a generic, end-to-end trainable Fusion-based Recurrent Multi-Modal (FRMM) architecture to address multi-modal applications. FRMM allows each input modality to be independent in terms of architecture, parameters and length of input sequences. FRMM image description models blend convolutional neural network feature descriptors with sequential language data in a recurrent framework. In addition to introducing FRMMs, this work analyzes the impact of varying activation functions and vocabulary size. For training and testing, the Flickr8k, Flickr30K and MSCOCO datasets have been used, demonstrating state-of-the-art description results.
	Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2017
	Mode of access: World Wide Web
	School code: 0465
Subject	Computer engineering
	Computer science
	Electronic books.
	0464
	0984
ISBN/ISSN	9781339829302


