This is a guest article by conference interpreter Natalia Fedorenkova. She compares the most popular RSI platforms (KUDO, Interprefy, VOICEBOXER, Interactio, SPEAKUS and VERSPEAK) and one event platform, Zoom, by cost, functionality and application. For more information, please visit her website, www.natfedorenkova.ru.
RSI platforms substitute and complement hardware equipment. They can be used for online events and webinars as well as regular on-site conferences with interpreters connecting remotely from the comfort of their homes or specially equipped studios.
Fig. 1. Working station of a remote simultaneous interpreter
This is a real photo of me working remotely. As you can see, I have a laptop in front of me and a monitor connected to it. The RSI platform runs on one screen, and on the other I can look through support materials and glossaries or browse the internet. You can use two separate computers for that purpose, of course, but it is not really necessary: one powerful laptop or desktop covers it all. A single screen is less convenient, since you will have to open support documents on top of the RSI platform and lose sight of the video stream and the chat with your partner and moderator, so you can miss important information.
What else do you need? A quality headset and microphone. The most essential criterion here is comfort, since you will have to wear them for quite a while. Some RSI platforms have preferences in terms of headset brand and type (see below). However, the final decision is yours. One more mandatory requirement is a high-speed wired (Ethernet) internet connection, or even better, two of them. I installed two completely independent Ethernet lines for full redundancy: if one line is down, the system automatically switches to the second. One more thing: it is recommended to purchase UPS (uninterruptible power supply) devices for all units to survive short blackouts.
Your working station should be as soundproof as possible to provide clear sound in the mic. You are often asked to close all the windows to block the noise coming from the street. Many platforms recommend working from a specially equipped studio or hub if there is one in your location.
Criteria for comparing RSI platforms
Natalia identified 11 main criteria for comparing RSI platforms.
An event platform or only RSI platform
Response to requests (communication quality)
Possibility to listen to the floor and your partner simultaneously
Technical requirements for the interpreter’s working station
Mobile application for interpreters.
Let us start with the most expensive platform, KUDO. It is both an event platform and a remote simultaneous interpreting platform. Thus, for an online event, KUDO does not need any additional external program (like Zoom, Skype, etc.). KUDO supports screen sharing, document uploading, messaging in the chat (for event participants) and polling. It has an interface for participants and for interpreters:
Fig. 2. KUDO interface for event participants
Fig. 3. KUDO interface for interpreters
Brief description of the interpreter’s interface:
Name of meeting
Access to profile
Configure outgoing channel (language)
Outgoing language selection (the language you will be speaking when you activate your microphone)
Configure incoming language
Incoming language selection
Video input selection tabs (alternate between live video or presentation)
View layout toggles (switch between gallery or single view)
Video thumbnails of active speakers
Microphone on/off button (a red microphone button indicates it is live, a green mic button means it is off)
Channel busy button (when you activate your microphone, this button will light up blue to indicate that the channel you are trying to broadcast on is already occupied by another interpreter)
Mute (cough) button
Incoming volume control
Floor input selector
Polls tab (access to poll results mentioned by the speaker)
Documents tab (download available documents)
Request list tab (access to the lineup of speakers who have requested the floor)
Shortcut list (display a list of available keyboard shortcuts)
Interface and Media Settings (select the interface language, and mic / speaker / camera to use on the meeting)
As you can see in the last picture, KUDO can have several video streams available at once, and you can switch between them. On the right-hand side of the screen, three incoming and outgoing audio channels are displayed, which means that in theory one pair of interpreters can work with three languages. The relay function is available. In my view, the most interesting and complex function, and the one implemented most differently across RSI platforms, is the handover function. KUDO has the most complicated, multi-stage handover process. There is even a special visual to explain it:
Fig. 4. Handover function in KUDO
Handover can be initiated by either the active interpreter (the one broadcasting at the moment) or the passive (resting) one. The active interpreter presses the blue button to the left of the mic button, which at that moment reads “handover”. He or she then sees the grey message “request sent”. If it is the passive interpreter who initiates the handover, he or she presses the same blue button, which in that case reads “request to switch”. Once the request is sent, the blue button changes to “waiting for approval” and a 15-second countdown starts. This is the time given to the passive interpreter to confirm that he or she is ready to take over by pressing the “I am ready to switch” button. After that, the active interpreter sees “Go” on the blue button and another countdown, this time 60 seconds, begins: during this time the active interpreter finishes his or her piece and finds an appropriate moment to hand over. Pressing the “Go” button switches off the active interpreter’s microphone, and a “your turn” message pops up on the passive interpreter’s screen next to the mic button. It means it is time for the passive interpreter to switch on his or her mic and take over.
That is how the very elaborate handover process developed by KUDO works. It was designed for situations when partners are in different places and cannot communicate as usual. Indeed, my RSI experience shows that handover is one of the bottlenecks of RSI. However, getting ahead of myself, I can say that in my view KUDO and Interprefy overdid this mechanism: the procedure is too complicated. When working with Interprefy, my partner and I usually agreed on a manual handover without pressing any buttons. When the time was right, the passive interpreter signaled to the active one via chat that he or she was ready, then the active interpreter finished his or her piece and simply switched off the mic in the next pause. The colleague saw it and took over. Otherwise, all these buttons and multiple stages of requesting and confirming the handover strongly distract you from the interpreting itself.
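The KUDO handover steps described above can be read as a small state machine. The following is only a minimal Python sketch of that reading, not KUDO's actual implementation: the class, state and method names are hypothetical, and the behaviour on an expired countdown (the request is simply cancelled) is an assumption, since the article does not say what happens when the 15-second or 60-second window runs out.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()                  # no handover in progress
    WAITING_FOR_APPROVAL = auto()  # request sent, 15 s countdown running
    GO_AVAILABLE = auto()          # partner confirmed, 60 s countdown running
    HANDED_OVER = auto()           # active mic off, partner sees "your turn"


APPROVAL_WINDOW_S = 15  # countdown quoted in the article
GO_WINDOW_S = 60        # countdown quoted in the article


class Handover:
    """Toy model of the handover steps (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.state = State.IDLE
        self.active_mic_on = True

    def request(self) -> None:
        # Either interpreter may press the blue button to start the handover.
        if self.state is State.IDLE:
            self.state = State.WAITING_FOR_APPROVAL

    def confirm_ready(self, seconds_elapsed: float) -> None:
        # The passive interpreter presses "I am ready to switch".
        if self.state is not State.WAITING_FOR_APPROVAL:
            return
        if seconds_elapsed <= APPROVAL_WINDOW_S:
            self.state = State.GO_AVAILABLE
        else:
            self.state = State.IDLE  # assumption: an expired request is cancelled

    def go(self, seconds_elapsed: float) -> None:
        # The active interpreter presses "Go" at a suitable pause.
        if self.state is not State.GO_AVAILABLE:
            return
        if seconds_elapsed <= GO_WINDOW_S:
            self.active_mic_on = False  # the active interpreter's mic goes off
            self.state = State.HANDED_OVER
        else:
            self.state = State.IDLE  # assumption: an expired "Go" is cancelled


h = Handover()
h.request()
h.confirm_ready(seconds_elapsed=5)
h.go(seconds_elapsed=20)
print(h.state.name, h.active_mic_on)
```

Counting the transitions makes the author's point visible: three separate button presses and two timers are needed before the partner can even switch on the mic, whereas the manual handover is a chat message and one mic toggle.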
One more thing about working in a pair on KUDO: you cannot listen to the speaker and your partner simultaneously; you have to choose one channel. KUDO representatives say that they are developing this function and expect to release it within a couple of weeks. For now, many interpreters use a second device to call each other via a separate program or messenger and thus ensure a seamless handover.
Let us move on to the next parameter: response to requests, or communication quality. I have to say that KUDO replies very quickly to all questions; no issues there. The onboarding process (training interpreters and adding them to the database) is also really well thought through: there is a dedicated training module on the website, complete with text descriptions and video guidelines on how to work with the KUDO platform. Once the “interpreter’s journey”, as KUDO calls it, is completed, interpreters fill in their profile on the website and can receive orders or work with their own clients using the platform.
KUDO’s technical requirements for the interpreter’s working station:
One (1) APC Smart-UPS unit as a power backup for all networking devices.
Google Chrome or Firefox web browsers
Stable, wired connection for all computers in use by interpreters
Upload and download network speed of at least 5 Mbps
Room-wide wi-fi for redundancy
Technical support is provided by the moderator on the RSI platform. There is a KUDO mobile application for participants, but no application for interpreters (for emergency situations).
Moving on to Interprefy. This is also both an event and RSI platform. Online events can be held either directly on the Interprefy platform or using external programs: Zoom, Skype, Webex, Microsoft Teams, etc. In the latter case Interprefy connects to these programs. They even made a picture to explain how it works:
Fig. 5. Interprefy’s connection to external programs to service online events
Interprefy’s interface for interpreters looks as follows:
Fig. 6. Interprefy’s interface for interpreters
It seems simpler than KUDO’s interface, though there are a lot of functions and buttons here as well. Incoming and outgoing language channels are configured on the top panel. One of the incoming channels is always the source (or floor) and the other is English (used for relay). The outgoing channels are the languages you are interpreting into. On the right-hand side you can see a microphone sign (red for on, grey for off) and a cough button; below them are a chat for the interpreters of one virtual booth and a moderator, and an event chat (with all the interpreters and several moderators). On the left-hand side there are video windows that you can switch between. In the top right corner is the handover function, already familiar to us. Interprefy’s handover process is also far from transparent: if you press the “now” button, you will see several more pop-up windows that the active and passive interpreters must click through before the handover can take place. In practice, as I already mentioned, my partners and I did not use this function and preferred the manual handover instead.
On Interprefy you can very conveniently listen to both the floor and your partner and even adjust the relative volume of these two channels: you can listen to the speaker with soft background interpreting of your partner (just to make sure the interpreting is taking place) or, on the contrary, increase the volume of your partner’s sound.
Interprefy’s response to requests is quite good: you can receive an answer the same day, often – within an hour.
Interprefy’s onboarding process is also well designed. It starts with a Skype interview. If everything goes well, they send you a checklist with technical requirements for the interpreter’s working station. The main points on the list are as follows:
Laptops: Intel Core i5 (or equivalent competing brand), 4GB RAM, Windows 10 or higher. If using Mac, OSX with the latest updated Operating System
Second computer or tablet
Ethernet connection: minimum download speed 8 Mbps, minimum upload speed 4 Mbps
USB Professional Microphone (recommended — Yeti Nano)
High-quality noise-cancelling headphones (recommended — Sennheiser HD200 Pro and Sennheiser Earbuds SX 3.00)
Software: Google Chrome and TeamViewer
After you make sure that your equipment complies with the checklist, Interprefy provides training on the platform in the form of a one-hour individual session with a technical specialist who demonstrates the platform functions and answers all your questions (unlike the KUDO training, where the specialist is replaced by a training module). After that you are required to pass a test (KUDO has nothing like this): you interpret a 5-minute video, and your interpreting is recorded and sent to professionals for assessment. If everything goes well, you are added to the database and cooperation begins. I believe that KUDO does not require a test because they position themselves as an RSI platform and not a translation agency: they invite you to work with your own clients using their platform and do not promise to place any orders with you (though, if I am not mistaken, clients can also use their database to look for an interpreter).
Technical support is available – it is provided by a moderator on the platform.
There is a mobile application for interpreters (for emergency situations when an interpreter cannot connect to the conference via computer). However, my experience of using it (admittedly, I tried it only once) showed that switching between channels takes more time than in a browser on a PC, which hurts the quality of interpreting: pieces of the text are lost. After several minutes of interpreting through the mobile app, my partner was asked to take over for these reasons.
Proceeding to the third RSI platform – VOICEBOXER. It is both an event platform and RSI platform, which means that for online events VOICEBOXER does not need any additional external programs. This is VOICEBOXER interface for event participants:
Fig. 7. VOICEBOXER interface for event participants
You can see two windows here: the central space is reserved for slides uploaded in advance, and the video on the right shows the speaker. Below you can choose the language: the floor or interpretation into another language. On the left vertical panel there are symbols for raising a hand, the list of presenters and attendees, screen sharing and camera settings. Below the video window on the right there is a chat for participants. I found it very helpful that slides are shown in the language the participant has selected. All the messages in the chat can also be translated automatically by clicking the “to translate” option.
The interface for interpreters looks as follows:
Fig. 8. VOICEBOXER interface for interpreters
As you can see, VOICEBOXER has 2 video channels: the largest space is taken by the presentation slides, and on the right-hand side there is a video window showing the speaker. At the bottom there is a selection of incoming and outgoing audio channels, 2 of each type by default: floor (the main incoming channel) and IT (for relay) as incoming channels, and EN and FR as outgoing languages. Selected channels are coloured blue and red respectively. On the left vertical panel you can see several more symbols: the mute function and the virtual booth controls for working in pairs, i.e. the multi-stage handover function. The active interpreter sends a handover request, the passive interpreter confirms his or her readiness, and the active interpreter finishes his or her piece and performs the handover:
Fig. 9. Handover process on VOICEBOXER
At the top of the left control panel there is a slow-down reminder for the presenter, a curious new function showing that an interpreter can even orchestrate the process. At the bottom there is an independent volume control, which can be used to adjust the relative volume of the incoming sound and your partner’s outgoing channel. Thus, it is possible to listen to both the floor and your partner. Below the video feed from the presenter there are several chats: general, for the interpreters of one virtual booth, for interpreters and a moderator, and for all interpreters on the project.
An interesting feature of this platform that we have not yet seen is that there is no mic button here at all. By default, one of the interpreters has their mic on when the event starts, and the handover switches off the mic of the interpreter who becomes passive and switches on the mic of the one who becomes active.
The response to requests varies. Sometimes it is good, with an answer the same day, and sometimes not so good: you have to wait a couple of days. Moreover, the “Contact us” form on the website is not operational, and it takes some effort to find the small-print contact e-mail in the bottom right corner.
The onboarding process consists of filling in the registration form on the website. However, I failed to register, since this form, like “Contact us”, did not work. Later I was informed that both are currently disabled due to spam and that VOICEBOXER is updating them. It should also be mentioned that demo sessions and trainings for interpreters are now provided not by VOICEBOXER itself but by partners they refer you to.
Technical requirements for the interpreter’s working station:
High-speed Internet connection
Technical support is provided. No mobile application for interpreters.
The next RSI platform is Interactio. It is both an event and RSI platform as well. Like KUDO, Interactio supports screen sharing, document uploading, messaging in the chat (for event participants) and polling. It has an interface for participants and for interpreters:
Fig. 10. Interactio interface for participants
Fig. 11. Interactio interface for interpreters
Brief description of the interpreter’s interface:
1 – Video feed
2 – Video feed selection buttons
3 – Chat (for interpreters and a moderator)
4 – Incoming language selection
5 – Incoming audio device (which the interpreter uses for listening)
6 – Status panel (event information, computer status, status messages)
7 – Outgoing audio selection
8 – Outgoing audio device (used for voice capturing)
9 – Logout
10 – Partner button (selected to listen to the partner’s audio channel)
11 – Floor button (selected to listen to the floor)
12 – Microphone on/off
13 – Mute button
14 – Handover
On the interpreter’s screen we see that there are 2 default incoming and 2 default outgoing audio channels (the coloured panels on the left and on the right), though in theory any number of them can be configured. Incoming sound channels are on the left-hand side and outgoing ones on the right. The central panel between them displays important information: the current interpretation language, the overall time and the time of the current take (a helpful function!). Like other RSI platforms, Interactio provides relay and handover functions. For the latter there is a dedicated button on the right-hand side. If the active interpreter presses it, the passive interpreter sees the corresponding message on the central panel. After that the active interpreter switches off his or her mic and the passive interpreter takes over. Manual handover is also possible: the passive interpreter can switch on the mic (the central MIC button), and the active interpreter’s mic then switches off automatically. Interactio representatives explained that this option lets the passive interpreter take over without waiting for the active interpreter to switch off his or her mic in emergency situations, for example if the active interpreter is unwell and cannot complete the handover properly.
The response to requests: before the first version of this article was published, it took Interactio 2 days (and 2 e-mails) to answer. I also could not do the training on the platform: they said that training is provided only after an order for your language pair is placed and you are assigned to the job (which is strange: how can I recommend this platform to my client if I do not really know much about it?). However, after the article was published and reached them, Interactio called me and offered a training session. As a result, I could describe the platform functionality better in this updated article. Interactio attributes our ineffective communication to the pressure of an increasing workload, which is quite understandable if we remember how many companies are going online these days. They promise to do better next time.
Technical requirements for the interpreter’s working station:
Ethernet connection; ping (reaction time) should be less than 100 ms; download and upload bandwidth should be equal to or greater than 10 Mbps
Professional hardwired headset
There is no mobile application for interpreters. However, Interactio is at the testing stage of an application to be downloaded to the PC or laptop and used for participating in the event or interpreting alongside the browser version – this is a new feature we have not yet seen before.
SPEAKUS and VERSPEAK
The last two RSI platforms we will look into in this overview are SPEAKUS and VERSPEAK. They are identical in terms of functionality (the company has recently split into two brands), so we will analyze them together. This is a dedicated RSI platform, so for online events it connects to external programs. This is what the SPEAKUS and VERSPEAK interfaces look like:
Fig. 12. SPEAKUS interface
Fig. 13. VERSPEAK interface
This interface looks simpler than that of other platforms. However, all the main functions are there. It is possible to configure several video channels, including the presenter’s slides, and switch between them. There are 2 incoming and 2 outgoing channels: the EN and RU (or ZH) buttons for outgoing channels, and RELAY/RU/EN (at the bottom) and DE ZH/ SPEAKER for incoming channels (DE ZH/ SPEAKER is the floor sound; RELAY/RU/EN is relay from English or Russian in case the floor language changes). The ON AIR button switches on the interpreter’s mic (red for on, grey for off). The MUTE button is used to switch off the mic briefly. Small black squares below the channel buttons indicate sound streaming: a green light in their top right corner means that the sound is streaming. Below these squares there is a chat for interpreters with hot buttons. Unlike on other platforms, however, you cannot communicate with a moderator here; a special WhatsApp chat is created for this purpose. At the bottom there are one or two more windows (two if there are two interpreters), where the interpreters can see and hear each other even if no one is on air at the moment. This is also a distinctive feature of the SPEAKUS and VERSPEAK platforms. The question is whether it is convenient and necessary. In my view, a microphone indicator and the partner’s sound when on air are quite enough (as in Interprefy and KUDO). The interpreters’ video and sound feed can be disabled if needed. One more specific feature: if you accidentally switch off the sound on your side, your partner won’t be able to switch it back on. It happened to me and my partner once: I accidentally switched off my sound and later, while interpreting, did not realize that she could not hear me and did not know when she could take over. Since there is no indicator on your partner’s side showing that your microphone is on, the ability to hear each other is crucial.
Correspondingly, here you can listen to two channels at once – the floor and your partner.
There is no automatic handover function here. However, in my opinion, it is not a flaw. As I have already mentioned, a too complicated handover mechanism is counter-intuitive and only distracts interpreters from the interpreting process.
SPEAKUS and VERSPEAK managers respond to requests within an hour. The fact that these are Russian companies probably plays a role here: they are naturally closer to us, Russian users.
The onboarding process is quite simple: you agree on a demo session where you can see the interface for yourself and get answers to all your questions. It takes about half an hour. No testing is required.
Technical requirements for the interpreter’s working station:
High-power computer, Intel Core i3 or higher, 4 GB RAM, Windows 8.0 or higher
Ethernet connection, minimum speed 10 Mbps
Comfortable USB noise-cancelling headset with microphone
Technical support is provided via external applications (WhatsApp).
No mobile application for interpreters.
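The connection requirements quoted in the sections above can be compared against your own speed-test results with a few lines of Python. This is only an illustrative sketch: the `Requirement` type and the `platforms_supported` function are hypothetical names, the thresholds are the figures given in this article, the single "minimum speed 10 Mbps" for SPEAKUS and VERSPEAK is assumed to apply to both download and upload, and VOICEBOXER is left out because the article gives no numbers for it.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Requirement:
    min_download_mbps: float
    min_upload_mbps: float
    max_ping_ms: Optional[float] = None  # None: no ping limit quoted


# Thresholds as quoted in this article.
REQUIREMENTS: Dict[str, Requirement] = {
    "KUDO": Requirement(5, 5),
    "Interprefy": Requirement(8, 4),
    "Interactio": Requirement(10, 10, max_ping_ms=100),
    # Assumption: the single 10 Mbps figure applies in both directions.
    "SPEAKUS/VERSPEAK": Requirement(10, 10),
}


def platforms_supported(download_mbps: float, upload_mbps: float,
                        ping_ms: float) -> List[str]:
    """Return the platforms whose quoted requirements the connection meets."""
    supported = []
    for name, req in REQUIREMENTS.items():
        if download_mbps < req.min_download_mbps:
            continue
        if upload_mbps < req.min_upload_mbps:
            continue
        # Interactio asks for a ping of less than 100 ms.
        if req.max_ping_ms is not None and ping_ms >= req.max_ping_ms:
            continue
        supported.append(name)
    return supported


print(platforms_supported(download_mbps=9, upload_mbps=6, ping_ms=40))
# ['KUDO', 'Interprefy']
```

The measured values are whatever your own speed test reports; the check covers only the network figures, not the headset, computer or power backup requirements.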
What does Zoom offer? It is common knowledge that Zoom is an online platform for holding meetings and conferences. I believe that everyone has seen the Zoom interface:
Fig. 14. Zoom interface
Zoom now offers a new function, interpretation, available under the Pro plan with the optional add-on plan “Add Video Webinars”. It is possible to enable 9 language channels here (apart from the floor). The conference host can enable this function and assign an interpreter, which creates an additional audio channel. The listeners can select the channel (language) they want to listen to. For more details on this function, please see the respective section on Zoom’s website: https://support.zoom.us/hc/en-us/articles/360034919791-Language-interpretation-in-meetings-and-webinars#h_ead62b5f-f2e1-44cd-aa62-58a4bc2e8bcf. What Zoom lacks compared with dedicated RSI platforms is relay. Moreover, the interpreters cannot hear each other, which greatly complicates the handover process. They have to connect via an external channel (any messenger or Skype) and hold the call throughout the conference to hear each other and ensure a seamless handover. The second option is to connect to the Zoom conference from a second device as a participant and listen to the interpretation channel.
Technical requirements for the interpreter’s working station are not mentioned anywhere. There is no technical support (the client has to take on this responsibility). The response to requests is very slow, more than a week (obviously due to peak demand these days). The interpreters who have tried working with Zoom say that everything works well and the sound quality is good. However, they sometimes face various issues (for example, the host fails to assign an interpreter). Zoom says that the interpretation function is still at the testing stage, which means that errors are inevitable. Zoom has a mobile application that can be used for interpreting as well.
Thus, we have compared the 6 most popular RSI platforms and 1 event platform, Zoom, by 11 criteria. I hope that this information will be helpful and now you know a little more about what remote simultaneous interpreting (RSI) is and how this function can be technically implemented.
Interprefy usually has a day and a half-day rate. Although now, since many companies are forced to move their short events and meetings online, they have designed a special proposal: 200 USD for 1 hour (respectively, 400 USD for 2 hours, etc.). 1 working day (8 hours) will be 1600 USD. There is also a referral program: they offer a commission to partners bringing clients.
VOICEBOXER prices: 1 hour – 250 USD (technical support included), one day (up to 8 working hours) – 1000 USD (2 language channels). VOICEBOXER offers partners a 20% commission.
Interactio offers the following pricing: 1 hour – 450 USD (minimum order), 1 day (8 working hours) – 490 USD, technical support — + 45 USD/h. (50 attendees, 2 language channels).
KUDO proved to be the most expensive platform: 300 USD per hour with a 2-hour minimum order (taking into account equipment check and preparation) plus 85 USD per hour for technical support, i.e. 770 USD in total for 2 hours. However, in March and April (this may be extended), to support businesses during the pandemic, they are offering a special 50% discount on the platform fee, i.e. the price for 2 hours will be 470 USD (which is still higher than that of Interprefy). 1 day (which they count as 10 hours for whatever reason) will cost you 2225 USD.
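The 2-hour totals quoted for KUDO can be checked with simple arithmetic. The sketch below is illustrative only and uses a hypothetical `session_cost` helper. It rests on one assumption: the 770 USD total corresponds to a platform fee of 300 USD per hour plus 85 USD per hour for support, and the discounted 470 USD total corresponds to the platform fee halved to 150 USD with the support fee unchanged. That reading reproduces both totals, but it is not confirmed by KUDO's price list.

```python
def session_cost(hours: float, platform_rate: float,
                 support_rate: float = 0, min_hours: float = 0) -> float:
    """Cost of one session: billed hours times (platform fee + support fee)."""
    billed_hours = max(hours, min_hours)  # a minimum order rounds short events up
    return billed_hours * (platform_rate + support_rate)


# Assumed reading of the KUDO figures (see the note above).
kudo_list_price = session_cost(2, platform_rate=300, support_rate=85, min_hours=2)
kudo_discounted = session_cost(2, platform_rate=150, support_rate=85, min_hours=2)

# Interprefy's special offer: 200 USD per hour.
interprefy = session_cost(2, platform_rate=200)

print(kudo_list_price, kudo_discounted, interprefy)  # 770 470 400
```

Because of the 2-hour minimum, a 1-hour event on KUDO is billed the same as a 2-hour one under this model, which is worth keeping in mind when comparing hourly prices across platforms.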
SPEAKUS offers the following prices – the basic tariff: 1 hour – 103 USD, 1 day (8 hours) – 250 USD (50 attendees, 3 language channels, technical support). They also have the antivirus tariff: 1 hour – 30 USD, 1 day (8 hours) – 103 USD (20 attendees, 2 language channels, technical support). (prices are in RUB, recalculated to USD as of 08.06.2020)
VERSPEAK has the most attractive prices: 1 hour – 130 USD, 1 day (8 hours) – 250 USD (50 attendees, 2 language channels, technical support). For the Covid-19 period VERSPEAK offers partners a 50% commission on every order. (prices are in RUB, recalculated to USD as of 08.06.2020)
In Zoom, the interpretation function is available under the Pro plan with the optional add-on plan “Add Video Webinars”. The total cost of such a plan is 54.99 USD per month.
These are the prices for the minimum package (minimum number of attendees and language channels according to the platforms’ price lists) for your reference. For a precise calculation of the platforms’ cost for your event, please contact the platforms’ representatives.
Interface for event’s participants (yes/no)
Default number of video and audio channels
Possibility to listen to the floor and your partner simultaneously
Response to requests
1 day (often – 1 hour)
2 days (2 e-mails)
Training module on the platform website, demo sessions, profile on the website
Skype interview, testing, individual training session
Registration form on the website
Questionnaire on the website, demo sessions
Price (1 hour)
Price (1 day)
Table 1. RSI platforms and Zoom. Zoom prices are provided per month.
Natalia Fedorenkova is a graduate of Lomonosov Moscow State University (2007). In 2007-2017 she worked as an in-house interpreter (Gazprom, McDermott); since 2017 she has been a freelance conference interpreter and, since 2019, a remote simultaneous interpreter with Interprefy. She was a speaker at the III Global Dialogue International Forum of Conference Interpreters and organizes RSI webinars for interpreters and businesses.