In a post shared on X on Monday, San Francisco-based OpenAI announced a new feature for its iOS and Android apps that lets users have their chats read aloud. It's also available on the web.
In an interview with me late last week, ahead of the announcement, Joanne Jang, who leads product for model behavior at OpenAI, said the company has been thinking about accessibility for some time. She pointed to a feature released nearly a year ago that lets people use images as input and ask questions about them. "We knew the technology would be powerful, but we really didn't understand all the ways in which it could be used by the world," she told me.
Jang said OpenAI contacted the Be My Eyes team to get insights and substantive feedback from blind and low-vision people. She characterized most of the feedback OpenAI received as glowing, and said there were many unexpected comments. People cited use cases like taking photos of their clothes and asking ChatGPT whether they match, or taking photos of someone's garden and having it described.
Jang said people were able to notice things about their environment that they hadn't known before. It was enlightening because the details came not from another person but from a more neutral observer. "I think that's when we learned, 'OK, there's something about AI here,'" Jang said of the lightbulb moment for OpenAI.
"It's the fact that it isn't another person offering an alternative view; the [chatbot] provides a new, more neutral perspective." She added: "We've been very keen to learn [about] accessibility from the blind and low-vision community. We're not pretending to say, 'Our app works perfectly for every accessibility use case.' We still have a long way to go, but we definitely want to bring this technology into everyone's life. We'd welcome the feedback." Expanding on her comments, Jang told me the team has learned a lot through its partnership with Be My Eyes.
One of the major learnings was that serving the blind and low-vision community doesn't mean only providing visual aids. Jang said OpenAI has learned that many people rely on audio-centric tools such as screen readers. For example, she noted that the operating system's built-in text-to-speech engine can sometimes struggle with a group of items on a shopping list.
By contrast, users can now take a screenshot of their Amazon cart and ask ChatGPT about it; Jang said it's a capability that didn't exist before. The feature should make shopping easier and more accessible. The depth of Read Aloud's potential impact has OpenAI excited.
"We're very excited about this technology because we've been working on it for some time," Jang said of Read Aloud. "There are a couple of things that get me excited about it. One is that, before these voice capabilities, a lot of the work centered on writing [and] typing.
"But not everyone thinks that way. I think through writing, but I know a lot of people around me who think better by talking things through first. I think a lot of people feel anxious when it comes to writing to a chatbot, but it feels more comfortable to talk to ChatGPT in a conversational way. The thing I'm most excited about is that [voice capabilities] open another way for people to interact with advanced technology, and in doing so open a way for them to communicate and express their ideas better.
"Honestly, that's one of the things we're most excited about at OpenAI: ensuring that advanced AI technologies benefit all of humanity. I'm particularly excited that it can meet the needs of people for whom writing hasn't been a strength, for whatever reason." Jang's colleague Mada Aflak, an engineer on the ChatGPT team, agreed.
In a joint interview with Jang, Aflak said, "Speaking is a fundamental human skill, [so] it's also important to enable AI to communicate through voice. There are many conversations that feel fine in writing, but others feel much more natural through voice, such as when you're thinking something through. All of these use cases will work much better with voice.
"I believe whatever you can do by typing, you should be able to achieve through voice commands. Building technology that can understand and talk to users in natural language by enabling voice commands will help make every digital device more accessible. With voice right now, we can also generate [images] without typing anything. Ultimately, whatever you can do by typing, it should be able to do by voice."