Exploring Instructional and Access Technologies
Captions
(M11D)
Artificial Representations of Sign Language to Access Information: How Effective are They?
Saduf Naqvi
ROUGH EDITED COPY
RIT-NTID ARTIFICIAL REPRESENTATIONS OF SIGN LANGUAGE TO ACCESS INFORMATION: HOW EFFECTIVE ARE THEY? PRESENTER: SADUF NAQVI MONDAY, JUNE 23, 2008 11:00 A.M.
CAPTIONING PROVIDED BY: ALTERNATIVE COMMUNICATION SERVICES, LLC P.O. BOX 278 LOMBARD, IL 60148
* * * * *
This is being provided in a rough-draft format. Communication Access Realtime Translation (CART) is provided in order to facilitate communication accessibility and may not be a totally verbatim record of the proceedings.
* * * * *
>> I want to welcome you and hope you're enjoying your morning so far. I would like to introduce Saduf Naqvi. I'd like to introduce our interpreters, as well, Marie Bernard and Sara Jacobs. I want to thank them for interpreting this morning. When Saduf is done with her 30-minute presentation, we will open it up for questions. We'll pass out an evaluation because we would like your feedback.
>> Saduf Naqvi: Thank you. First of all, thank you very much for coming to my presentation. Myself, I'm a research student and I'm presenting some of my work. So my hands are folded very tightly so I don't look too nervous. I have given you all a copy of my presentation, and the paper is available on the RIT website. I've also given you some information for a conference for Deafax, which is based in the UK. They are an organization focused on technology and deafness. If anybody would be interested to come or find out more about it, the information is there for you to have a look at. Also, to the interpreters, thank you very much for being here. If you need me to slow down, please let me know. So my research has looked a lot at artificial representations of sign language. In this presentation I will be giving you an introduction to the research, a bit of background, the hypothesis and the methodology of the research. We will look at the sample population, the materials used, and the procedures and results. I wanted to check, is anybody in the room visually impaired? There are some statistics that will come up, and I need to know if I need to voice them out for you. Anyone? Nope. There are a lot of numbers, so I will leave them for viewing but I will explain them myself. The introduction. There are different digital representations of sign language available. They have been widely used and adopted for a variety of different systems with the main goal of making information accessible to the deaf.
Although the systems have been continuously refined and adapted, it is known that they all receive very mixed reviews and some receive low acceptance rates. Does everyone know what I mean by digital representation of sign language? This is a real interpreter, a real signer, who we have here. A digital representation of this lady would be a video recording of her, as one instance, so it's not the real interpreter there but a recorded version. This has now become a digital format. It's not really been defined this way before. It was done so in order to contextualize the research and make it more clear. Okay. The problem with DRSL highlights two possible issues. Either the content of the digital representation presented to the target audience was inappropriate, or the DRSL was unable to transmit the information correctly to the deaf audience. This study will be comparing different digital representations of sign language in order to highlight and understand the different properties and uses of the three types of systems. Sorry, too fast. Okay. Would you like me to go back and repeat? Okay. Sure. There are different types of sign language representations. Some of them are accepted and liked a lot, and some of them are getting very mixed reviews, where people reject them and don't want to use them at all. So the aim of this research is to work out what it is that makes the systems successful or unsuccessful, and that's the whole point of this work. There are two areas that explain why a system would or would not be successful. One, the way the artificial signer was signing was incorrect, or two, the representation of signing, as in the recording of the person, was not appropriate for the people who were using the system. So this study will compare these systems against each other. In order to frame this study, some definitions needed to be established as to the uses of these digital representations of sign language.
From now on I'll be referring to it as DRSL. That's a bit faster to say. Firstly, when DRSLs are used, they are designed to explain information. How can we define information? For this study we divided it into two areas: one being realtime information, which is created on demand, and the other being static, which is prerecorded and called up when needed. So we need to understand what types of digital representations there are. We have notation systems, as you can see in the colorful drawings. This is a very well known sign writing system. Then you have video systems, which we are quite familiar with. And then you have animation systems, where there's an artificially designed person who signs. The documentation of sign language and its movement in a digital form has led to different versions of such systems. Some systems allow users to transform information into a printed form, such as the notation systems, while others record signing and are able to retransmit it in a two-dimensional form, such as video recordings. You have the flat presentation of someone signing, but you can't necessarily move and see the signer from side angles. Although video recordings are less transferable, they are still very widely used on television, video and the internet. A third, more recent method is animation, where you have a virtual signer who can sign the information three-dimensionally. You can move it to different angles and see the signer from different perspectives. A big question in the study is how comparable are the three systems? They look so different. So in order to make them comparable, we wanted to look at how well they communicate with the audience they are targeted at, the deaf community. To make them more comparable the following was done. All three systems represented information in the London dialect of British Sign Language. They use the same dialect.
They use the same sentence structure, vocabulary, pace and order. Obviously if one system is signing something a lot faster and the other one is a lot slower, that won't necessarily work. We wanted to make sure the timing was appropriate, and that the same signs were used to represent the same information. So we made sure that was correctly done and consistently shown. Although the systems are different in their presentation and in their ability to communicate with the target audience for whom they have been designed, it was important to make sure they were comparable to this level to do the study. So we look at the hypothesis. The type of digital representation of sign language, such as the Avatar, video or notation system, used in different information contexts, such as static and realtime, will determine the acceptance rates of the systems and ultimately the efficiency of the method of delivery. We aimed to check the hypothesis in different information delivery contexts. That's a bit of a mouthful. I want to check if that's clear to everybody. Is that okay? Fine. Okay. So, the sample population. There were a total of 20 participants, as this was thought to be an appropriate sample size for an initial study. If the results had been shown to be insignificant, a larger sample would have been sought. The participants were London signers, and the systems that were chosen could demonstrate London sign language. The sample was random, the data was collected from questionnaires, and support was provided with interpreters. Participants were also filmed in order to make note of any comments made during the study. The materials used: the users were presented with two information category presentations, like we said, static and realtime. Static was information that is standard and not often changed.
A good example is when you have a manual for something, which rarely changes, or the same type of information. And then realtime information, which is information generated on demand, such as news or the weather. The systems we used would represent static and realtime presentations. The presentations within the categories were as follows. For example, let's look at static. In static presentations we had an Avatar that showed a prerecorded sequence, played as an Avatar clip. We then had a video sequence that was prerecorded and shown as a clip. And we had a prerecorded notation sequence, which was displayed as a notation graphic. Realtime presentations were, again, the Avatar, which was played from the software spontaneously as realtime generation. Then the video clip. The video clip was a little bit interesting. We had to design this because realtime video generation is not always possible without having a real signer there. So we thought, okay, look at the two properties. I'm going to go off topic a little bit so you can understand why this was done. With animation systems, one of the biggest properties is that they can generate a realtime delivery of a clip, but video systems can't necessarily do that because they often get rerecorded and edited and so forth. So to test the two technical approaches, we had a series of video clips joined together to make a realtime clip. You literally would have a signer signing a word, the hands were dropped, signing a word, hands were dropped. That would be an example of how realtime video could potentially be made. In notation there was a set of graphics that were also organized to spontaneously generate as a sequence was required. Then in the procedure the following steps were implemented. We had an introduction, where the participants were recorded one at a time and general information was given. We -- sorry.
We had familiarization with the system, where the experiment was explained to the participants; information gathering, where each participant was asked questions regarding the presentations they saw; and then questions and answers at the end of the session, so participants were given the opportunity to ask questions. In the results, we investigated, if each digital representation of sign language could be used to represent realtime or static presentations, which of the two would rate better and why, and also what the significance of the ratings is. They were tested against each other, and questions were asked about how the presentations were perceived in terms of linguistic ability, acceptability, comprehension, usability and likeability. Now, I've got two graphs up of the descriptive statistics. In the graphs you can see the ratings were from one to five, five being the worst and one being the best. And you can see that when it was presented in static mode, the most preferred system was still video and the least preferred system was notation. Animation rated around the middle. We can also note that it is quite similar to how the realtime presentations did, as well, because the notation system was still less preferred and the video system was preferred more. We noted that the only result that shows statistical significance was in the animation digital representation, between static and realtime presentations. In the linguistic category we had it broken down into a checklist, almost, to assess the system against the linguistic ability of the presentation. And the handshape was noted to have statistical significance between realtime and static presentations. It was noted that the difference in overall means was statistically significant: the mean for the static presentation of animation was 3.2 and the mean for the realtime presentation was 3.8. So the static presentation rated better than the realtime presentation of the Avatar.
And that's the results. In the digital representations of animation, when static and realtime presentations were shown, the following categories had significant correlations. More details can be found in the paper. There's a lot more data but it's too much to put in the presentation. It was observed that although the digital representations of sign language were different in static and realtime mode, they had significantly high correlations. In the digital representation of animation, when static and realtime presentations were shown, the following categories had significant correlations: likeability, usability, linguistics and, under linguistics, morphology, lip patterns, facial expression and correct sentence structure all had high correlations between the static and realtime presentations. In the digital representation of notation the following correlations were also noted: likeability, acceptability, comprehension, linguistics, and then under linguistics we also noted the handshape, morphology and distance of the arm from the body. Also the ratings of lip movement, facial expression and correct BSL structure were noted to be quite similar. We noted that between realtime and static presentations there were a lot of similarities in terms of how they were viewed, and they weren't necessarily viewed differently. It's a very big data table, but I'll just explain. We did some sample tests. We did static and realtime tests. These results have been listed in the table. As you can see from the average means, the systems were rated quite poorly on the scale, where one was excellent and five was poor. In other words, for animation and notation, not only were the static and realtime means similar for both categories, as shown in the table, but also the participants who gave higher or lower ratings for static presentations also tended to give higher or lower ratings for realtime presentations. It gave a generally high correlation. Further analysis was done through a series of ANOVAs.
This table examined inferential statistics for the data found in the study. One-way ANOVA was used to test the differences between video, animation and notation in respect of usability, acceptability, likeability and linguistics. Linguistics was broken down into its respective components: handshape, morphology, distance of the arm from the body, lip movement, facial expression, correct sentence structure, correct placement and correct signing context. Also, just to explain, the choice of linguistic criteria that were defined was done in consultation with teachers of the deaf, people who work with the deaf community, but we didn't want to make the linguistic language too difficult for deaf people with low literacy or language development skills. We wanted to make it quite accessible to all the deaf people we were working with, so we would get a wider perspective than only from those who are known as deaf professionals. We wanted to get a wider perspective of all deaf people in the research. Sorry. From the ANOVA, further tests were conducted between the systems to see if there was a difference between the groups. This was conducted on the data from the same sample group of 20 participants within the study. And in summary -- no summary, sorry. Going on. These were the results from one set of ANOVAs we had. There was another set of ANOVAs for the realtime testing, and we found a similar pattern. So we started to realize through the analysis that a difference in opinion between realtime and static systems wasn't necessarily happening. It was more a difference in which system was preferred over another. We noted that video systems were still generally preferred over the more technically advanced systems such as animation or notation. So it started to raise a few questions. For the t-test.
Now, the null hypothesis states that participants shown the realtime and static information systems will show no difference in their perception of digital representations of sign language. Thus the acceptability of the system relies on its own ability to function as the most technologically optimal solution for the given information context. However, the alternative hypothesis states that there is a difference in the perception of the systems when presented in different information delivery contexts, i.e., static and realtime. For example, Avatars are promoted on their ability to present realtime sign language generation for any type of information. Video has limitations in that it needs to have the vocabulary and sentences prerecorded. Therefore, joining various video clips together can appear awkward in its delivery of information, and the clips do not connect like natural sign language. Avatars are able to connect the vocabulary together so the sentences are smoother. It can therefore be assumed by practitioners in the field that Avatars pose a more appropriate solution for the realtime generation of information in sign language. In this research we tested this: if we made an alternative with video, presenting the awkward connection of vocabulary for realtime information delivery, how does that compare to the more advanced Avatar? This was also done with notation systems. In the static presentation, video was recorded and was what would be seen as the most popular solution, as it was the most widely used. However, Avatars provide static presentations as well, but they have the ability to move three-dimensionally. So they also presented a realtime solution, and therefore they could be viewed more slowly, further away or at another angle. So it can be asked how this would compare in the same information delivery context.
Notation systems provided an alternative solution, as they were like a written language and could be manipulated much like subtitles, either recorded in advance and presented on demand or presented in realtime, dependent on the information delivery context, realtime or static. It was found through the data analysis that this was not in fact the case. We identified that there was a strong difference in scores between the three systems, but that difference was not an effect of the information delivery context. Irrespective of the information delivery context, the linguistic criteria determined the system's level of acceptance. This was -- oops. Okay. Another point: there was further analysis done through linear regression, and it showed that there was a positive relationship between the linguistic criteria and the overall acceptance of a system. We also noted that participants continued to make a particular comment when viewing systems that had a lower linguistic ability. They said, this lacks emotion for me and I don't understand it. So what is it that deaf people see as emotion, and what is it that technologists don't seem to understand about the emotion that is missing in the systems? The approach used in this investigation has assisted in showing how technologists may identify the potential of a system and how it is received. Most essentially, it highlighted the finer criteria that are often missed by technologists in thinking through a system design, and also highlighted their importance in terms of clear and effective communication with the deaf. The hypothesis tested, specifying that the type of information delivered in different digital representations of sign language would affect the system's level of acceptance, was not supported by the findings.
Instead it was found that if the system demonstrated particular linguistic criteria of BSL, it would then receive higher acceptance from the deaf community, irrespective of being in static or realtime mode. The main theme continuously mentioned by participants was emotion. The results show that regardless of technology, the level of emotional representation is the key determiner of the system's level of success, and the emotion could be attributed to very particular linguistic criteria, the highest noted ones being lip patterns and facial expression. Sign language technology has been fast developing into a popular area of research, with many advances in tools, technology, hardware and software continuously being generated. It has spurred discussion amongst the deaf and hearing community about the potential real world application of many of the systems that have been introduced. There have been many claims about the real world application of the systems, and the potential has been and is being researched. However, there's been skepticism within the deaf and hard-of-hearing community as to whether digital representations of sign languages other than video have any potential to work within the deaf community. In many cases their criticism has been well justified. The aim of this research was to understand more about digital representations of sign languages and whether particular DRSLs are more suited to particular information delivery contexts. However, from the findings it was identified that, irrespective of the information delivery context, the DRSL itself held more importance, the central theme being: can the DRSL communicate effectively? What was found was that it's not necessarily the system but more so the emotional characteristics, also known as the linguistic criteria, of the system that are needed for effective communication.
It could be said that, irrespective of the information system presented, if it holds true to the characteristics of sign language, such that it has particular linguistic elements present, it will receive a positive end user response. This research has provided an in-depth understanding of the user's perspective of end products that have been proposed for language support for deaf people. Just to acknowledge, I would like to thank Anthony Rayburn for his encouragement when I started this research. He is well known in the deaf community and encouraged me to tackle this issue. I am very thankful to members of the deaf community. Not only did they help me to understand the concerns of the community, but they had lengthy conversations with me about the barriers faced by the deaf community. These made me realize the widespread recognition of the distinct needs and the more focused understanding in the modern-day needs of information systems. In particular I would like to thank the deaf community for allowing me to see more of their world, culture and the beautiful structure of sign language. It's hoped the research presented here will go some way to thank you all for your support. [ APPLAUSE ]
>> Are there any questions? Would you come up to the microphone if you have a question.
>> Could you explain a little bit more about the differences, say, between acceptability and comprehension?
>> Saduf Naqvi: We wanted to get a gauge of how the system is seen. It is broken down into the four things. So when we looked at acceptability, it was: how acceptable is the system as a representation of this information being delivered? Is this a format you wouldn't mind using? It was a range of questions. It wasn't given as one question. It was a range of questions, but I didn't have enough space in my presentation.
>> You have a range of questions for each of the metrics.
>> Saduf Naqvi: Even in the paper I didn't have enough space for the study. I would be happy to send that to you.
As a general explanation, for each criterion there was a range of questions that participants could answer. They were obviously all jumbled up so you couldn't identify a theme. So when people would answer the questions, they would say, you know, I like the presentation, I think this can be used -- it was quite interesting. Some people said, I think notation systems can be used, but I have to learn the notation first to be able to use it effectively. A lot of people see the potential of animation systems for the future. They said the big flaw of a lot of animation systems is that they focus a great deal on the hand shapes and the movement of the arms but they lose a lot in the face, and it's quite well known that deaf people, when communicating, focus very closely on the face. There have also been studies recently where they looked at the eye movements of deaf people when communicating, and they noticed that the main periphery was around the face. So the more intricate signing was around here and the bolder hand shapes were further out. It's quite clear from that that if you don't have that information, you're very likely going to make a system that probably won't work. But obviously systems have been made, and then they had mixed reviews. So there was a split where people were saying to me, I like it, but the system doesn't represent my language or my culture, and if you don't have that in this system I find it difficult to use. Another very big thing that came through was people don't like the fact that, you know, I don't have a person to sign to. I want a person to sign to. Why should I sign to a system? Which is a fair point. I think the worry about this becoming a more widely used thing kept coming through as something that made people quite reluctant toward the systems.
>> I want to thank you for this work, because among the technology people who are hearing, yes, the emphasis has been on hand shapes and arm position, and what you've got here is a nice empirical study that really does drive home the fact that future efforts should be spent on the face.
>> Saduf Naqvi: Thank you. Also, to make everyone aware, this study has generated a lot of other work which I would love to have put in here, but it was just too much. What I worked on was designing a framework which will support the design of systems for the deaf in the future, highlighting the fact that facial expression is involved. But the biggest thing that's been highlighted is that deaf people need to be involved, because ultimately what seems to be happening is the system is made and then it's, now tell me what you think of it. And the fact is you've not involved anyone in the design stages, so if you don't involve anyone in the design stages, then it's not really their system. How are they going to use it? If you look at video, deaf people work with video. They understand where the camera angle needs to be. They understand it needs to be focusing near the face and having good lighting. So there's a lot of things that have already been worked out with video which haven't been looked at in animation, because it's not been done by deaf people. It's been done by people going, I think it's hand shape and I think it's a bit of arm movement. There we go. We have vocab. This is what really pushed the study forward. Sorry.
>> I agree with your last point, that deaf people aren't involved in, for example, animation. I wanted to ask about your participants. Was there a difference between the ages of the participants? Did your sample include younger people, say under 20?
>> Saduf Naqvi: I had participants from 18, and I had one gentleman who was 45. Very broad.
But I in particular wanted to involve not just deaf people who understood BSL to a very in-depth level. A lot of people understand BSL as their own natural language, but going into the deeper descriptions of linguistics, they understand particular things but not deep linguistics, and what we wanted to do with this study was to make it accessible to all deaf people. There were other things that came from the study. For example, I will put my hand up for my naivety when starting to develop this. When I first did the initial research paper, the documentation, you know, I worked with some people to show them and I really simplified it. I got it made into plain English, thinking this will work. One of my participants said, this is stressing me out. Why? The English is too much. But he said, I'll be willing to continue with your work because you've done it. It really highlights the fact that he felt he had to do it, but I said to him, you have a right to stop the study. Although it paused my work for quite a period of time, I learned a great amount about the visual needs for communication with deaf people of all literacy backgrounds, and there was a paper written about that, experimental design for the deaf. If anybody would like to know, they can have a copy of that.
>> This last part that you were describing, that emotions are important to demonstrate, but what's the definition of emotion? I'm not clear on that. For example, they always say deaf people are emotional and very expressive, but the facial expressions are based on grammar and syntax. Emotion is separate from that. In your system are you able to separate nonmanual markers from actual emotion?
>> Saduf Naqvi: The way in which I looked at it, I didn't look at it as deeply as this, to be completely honest with you. What we did look at is, what is it that deaf people are defining as emotion? The sign I kept getting was -- right?
Which means no facial expression. Yes.
>> Most deaf people could equate the two areas of emotion and nonmanual markers, but they have different functions and are very separate things. You need to look at nonmanual markers.
>> Saduf Naqvi: Fine. I would be happy to discuss that with you if possible. That would be really good. What we did was, when they said emotion, we asked, what is it, what areas of emotion? They kept defining it as the face and lip patterns. So it would be very good to have your feedback on that so I can include that in the design of how to make better systems. That would be great, thank you.
>> Thank you for your presentation on the DRSL. It's been very interesting. A few years ago I went to the British Museum and I found that there was information available for the deaf, and I was wishing that that same kind of system was available in a museum in the United States, where the information was more accessible to deaf people. So how does it become available? Who pays for these systems? Is it a government grant that ends up providing these systems? Or who?
>> Saduf Naqvi: I don't necessarily know in great detail about the funding and the process to develop these systems. I worked with people who had already designed systems that were used in this area and heavily promoted. I wanted to bring them forward for the research.
>> Who pays for that?
>> Saduf Naqvi: It would be whoever they apply to for funding. For example, with the British Library, I think they probably got some government funding to support that, because the British Library is a hugely fantastic resource for people to go in and read from. I think that's probably how it was done. But as for systems that have been designed, the animation system used was done by the university and the Royal National Institute for the Deaf. They got funding for that from the EU, but I don't actually work with either of these institutes. I worked independently on this work from Goldsmiths College.
>> Thank you.
>> Are there any more questions? Well, just simply, I'm not familiar with museums in Europe, really. But do they translate the English into a signed language order, or do the sign language representations follow a British-English order?
>> Saduf Naqvi: Thank you, very good question. There's been a range of animation systems, as everyone knows. Some of them will follow SSE, Sign Supported English. It would be the translation, word by word, of English into signs. This study focused on British Sign Language and British Sign Language structure, and we identified systems that could be done in BSL structure, not necessarily in English structure. We wanted to work with the BSL community. But there is still that kind of misunderstanding as to the language needs, because what is happening is sometimes a system might be made and it will be represented in SSE, and for someone who uses BSL it's fine but it's not acceptable, because it's not in the language that's needed for that individual. So that's been highlighted in further work in terms of understanding the language needs of the community, because not all deaf people are the same. All deaf people have their own approaches to communication and there are different standards and language styles.
>> I have another question about the notation system. Did you find anybody in your study could actually understand that notation system? It isn't taught, as far as I know, anywhere in Britain.
>> Saduf Naqvi: I went through. In the design of the study I really wanted to include notation systems because they're heavily promoted. It also raised the question of who actually understands notations. There was an explanation done with notations beforehand, before the study was done. We went through and said, well, this color and this movement means this. Go back. The pink, the star sincere someone's eyes, the hand, the movement for technology. And in ASL and British Sign Language technology.
So on and so forth, so it was explained. Again, if you look, there's no facial expression. The next sign is meant to be department. The department, like this. So DT is how it was done in the animation. We made it consistent in all of them. Again there's no face there, and so on and so forth. So a lot of people started to get the idea a little bit after viewing them again and again.
>> [ inaudible comment ]
>> Saduf Naqvi: This was not done quickly. They had to look at it and were asked, would you be able to use this? Do you like this? How does it rate? It wasn't shown as a quick flash and shut down. They could keep playing with it and working with it, so the systems could be looked at independently for a while. I didn't do it so they could see it quickly and then just mock it. Of course no one would remember it. Okay. Last question there.
>> Do you have any examples of the signing that you might have used? Like with the Avatar, et cetera?
>> Saduf Naqvi: Not on me.
>> I would love to see what it looks like.
>> Saduf Naqvi: I haven't actually got them on me.
>> Shoot.
>> Saduf Naqvi: Sorry. If you would like, I can e-mail you the clips. I have clips that I could send over to you.
>> All right. Thank you.
>> This will end our program today. You can fill out the evaluations that are available on paper or online. Thank you, Saduf.
>> Saduf Naqvi: Thank you.