Abstract
At the Technical University of Catalonia (UPC), a smart room has been equipped with 85 microphones and 8 cameras. This paper describes the sensor setup, gives an overview of the underlying hardware and software infrastructure, and indicates possibilities for high- and low-level multi-modal interaction. One use of the information collected from the distributed sensor network is explained in detail: the system supports a group of students who have to solve a problem related to a lab assignment.






Additional information
This work has been partially supported by the European Union, IP 506909 (CHIL).
About this article
Cite this article
Neumann, J., Casas, J.R., Macho, D. et al. Integration of audiovisual sensors and technologies in a smart room. Pers Ubiquit Comput 13, 15–23 (2009). https://doi.org/10.1007/s00779-007-0172-1
DOI: https://doi.org/10.1007/s00779-007-0172-1