Abstract
Current assistive living solutions based on activity recognition have adopted relatively rigid models of inhabitant activities, and the use of these models brings some deficiencies. To address this, a goal-oriented solution has been proposed, in which goal models offer a flexible method of modelling inhabitant activity. These goal models can dynamically produce a large number of varying action plans that may be used to guide inhabitants. To provide illustrative, video-based instruction for these numerous action plans, a number of video clips would need to be associated with each variation. To address this, rich metadata may be used to automatically match appropriate video clips from a video repository to each specific, dynamically generated activity plan. This study introduces a mechanism for automatically generating suitable rich metadata representing the actions depicted within video clips, in order to facilitate such video matching. The performance of this mechanism was evaluated using eighteen video files; during this evaluation, metadata was automatically generated with a high level of accuracy.




Notes
Personal IADL Assistant, PIA – EU AAL Funded Research Project (AAL-2012-5-033), available at: http://www.pia-project.org/
Near Field Communication – A short range contactless communication technology
Acknowledgments
This work has been conducted in the context of the EU AAL PIA project (AAL-2012-5-033). The authors gratefully acknowledge the contributions from all members of the PIA consortium.
Additional information
This article is part of the Topical Collection on Transactional Processing Systems
About this article
Cite this article
Rafferty, J., Nugent, C., Liu, J. et al. Automatic Metadata Generation Through Analysis of Narration Within Instructional Videos. J Med Syst 39, 94 (2015). https://doi.org/10.1007/s10916-015-0295-2
DOI: https://doi.org/10.1007/s10916-015-0295-2