SoundStreak: A Conversation with CEO Dan Caligor about the Technology Set to Unseat ISDN for Voice Over
NYCPPNEWS recently spoke with Dan Caligor, CEO of SoundStreak (www.soundstreak.com), which offers a service that seems set to replace the costly, complicated ISDN setups that have been standard for recording remote voice talent. After the company's initial launch a few years ago faltered as funding dried up, Caligor assembled a new team, building on the innovations of his initial partner, David Coleman of NBC.
In 2011, Caligor re-launched the New York-based company, which is now in its initial roll-out phase. We had a long conversation with Dan Caligor to learn more about the technology and how it might be applied to voice-over sessions as well as other possible uses.
Dan Caligor: SoundStreak is a platform for recording audio; at this moment in particular we’re talking about voice actors over the Internet. It is a substitute for ISDN as a connection between two remote locations, which until quite recently was pretty much the industry standard. Source-Connect is also in this space, although I discourage direct comparisons except as a starting point to get people to understand what SoundStreak actually does.
NYCPPNEWS: Why would you say your product is not comparable to theirs?
Dan Caligor: There are some key differences in what SoundStreak does that make it not directly comparable to SourceConnect. Let me give you some context first. From what you’ve read on our website, do you feel that you understand how SoundStreak works?
NYCPPNEWS: Yes, as you have a useful step-by-step diagram that gives a good sense of things. I know too that the Internet’s connections are a lot better today than they were in years past, for example how latency has been reduced throughout. So that would make your product more reliable.
Dan Caligor: Okay. Let me talk about how SoundStreak works. SoundStreak is designed to be used with any broadband connection and to make the best use of however much bandwidth there is on that connection. SoundStreak is, especially in its current incarnation, able to work with the standard consumer broadband connection and run on any consumer laptop or desktop machine.
Currently it is only available for the Mac, but it is designed as a cross platform product. We are in the process of developing the Windows version. Our number one objective is to minimize the amount of bandwidth required while simultaneously not compromising on audio quality. The solution we have is to use the talent’s computer as a local capture device. So when you are in a SoundStreak session with me, for example, we are talking over a voice patch and simultaneously hearing and seeing the playback of any tapes or backing assets, and so on.
However, we are not streaming anything. Everything is local playback. When I do a take with you, and you are the talent, the take is captured to your hard drive, and I am listening as you do your performance. The take is transferred in low resolution very quickly so that we can review local copies of it together for the reading and any background noise and things like that.
If we decide it’s a “buy” take, meaning a take that we want to use, it is transferred from the talent to the producer’s machine automatically. So you get a result of very little bandwidth required because all you actually are passing back and forth in real-time is the voice patch, which is essentially VOIP (voice over internet protocol) and signals for control, record, and playback.
Those are the only things that are happening in real time; everything else is happening asynchronously in the background. So, for instance, files are moving over FTP at whatever speed the network will bear, but the whole experience is very real-time and synchronous. That is, I, the producer, press ‘play’, and you and I see the backing asset played back in sync with the take that has just been recorded. And we both see it from local copies. Are you with me so far?
Dan Caligor: Okay. You asked about some of the differences in how we view ourselves versus how we view SourceConnect. The first is that we are a service, not a product. Our software is free, and SoundStreak signup is free. Although we are not yet charging people, the standard pricing we refer to, a flat fee per session, is “retail.” Other pricing structures will be offered for high volume users.
NYCPPNEWS: What range do you think it will be?
Dan Caligor: We have not yet announced that, but it will be volume dependent. Right now you can get an account on SoundStreak, which anyone can do, by the way. It is an open beta, and a very advanced beta; it’s really a release candidate. Anybody can go get an account. When you get that account, you are also given 10 free tokens. It’s basically one session, one token, and when you get down to one or two tokens it will give you ten more until we turn on pricing. At some point you will have to buy the tokens and they will be whatever they cost.
I am not being evasive about that; we want to watch people use it and talk to them about how they are using it before we set the price levels. Some people seem to want long sessions, some want short sessions, some use it a lot, some use it a little. We want to figure out a model that is fair to everyone.
Difference number one, coming back to SourceConnect, is that we consider ourselves a service and not a product, and that has a lot of implications for how we deal with our customers and how our customers pay for what they get from us. There is no software updating and there is no fee associated with anything we provide except the session.
The second difference is that we are a server/client model, in comparison to streaming models like SourceConnect. SourceConnect is streaming your audio in real time from point A to point B. It is compressing it to do so, and if the audio is lost in transit, you basically lose your take.
SoundStreak is recording the take locally, transferring it to our server and then transferring it to the recipient, which right now is the producer but in future versions can be a third party. And that matters for a few reasons. The first is that we traverse corporate firewalls very elegantly. People in corporate environments only have to deal with one IP address, which is our server. Even if they are dealing with lots of different talent from all over the world, all those talent are connecting to our server and being routed to them.
So we can traverse firewalls without compromising IT security policies. The second reason it is important is that everything that passes across our server is archived. That means the take files are archived on our server, and you can get them later if you need them. It means all the information about the session is stored and can be tracked, recalled, manipulated and used, which has all kinds of practical accounting and other applications.
It also means that we can enable a bunch of cool features. For instance, at the end of a SoundStreak session, say you and I have done one, if you were the producer, you would get an email as soon as the session ended that said “You just finished a SoundStreak session and it was 17 minutes long, it had 5 takes, the producer was Dan, the talent was Dan, of the five takes three were transferred. Here are notes that the producer took during the session, here are the durations of each take and for each of the transferred takes here is the link.”
So if you click on that link it downloads the take from our server. In addition to having it on your machine as the producer, you can send that email to the director and to your edit room if it is somewhere else. So you can send that email to anyone, and your editor or engineer can simply download whichever takes you tell them to.
NYCPPNEWS: I’m curious as to how the audio is dealt with on the talent side. Do you have to make sure they have a good microphone, record in a good room? And what if there is background noise?
Dan Caligor: There is a presumption that one of two things is the case: either the audio environment of the talent is of sufficiently high quality or, in some cases, the application, the reason for which the audio is being recorded, doesn’t require terribly high quality.
Mic placement and all of that is up to the talent if they are doing it from a home studio, but many of the models we see people using this for include using a professional studio at the talent end while the producer, the director, the engineer, the editor, whoever is not in a studio but at home or on location using a laptop or desktop.
The short answer to your question is yes, talent has to be able to set up their audio equipment or has to be in a place where there is an engineer to do it for them. We have found that in many segments in the voiceover market, home studios are virtually ubiquitous. They are becoming more and more common because the hardware to have a home studio, or for that matter the hardware to do independent remote sessions, is getting much cheaper, and of course the software is also always getting cheaper.
Where a home studio used to be elaborate and expensive, it is becoming progressively cheaper and easier, especially with SoundStreak, because you don’t need an ISDN modem. Now SoundStreak does read to picture, and it also allows the talent and production person to see that backing asset, the video that they are reading to, and a script simultaneously. So again the whole thing is designed so that the talent and production person are experiencing the same thing at all times.
NYCPPNEWS: So is that something that is happening on the talent’s computer and the producer’s computer and it syncs somehow to video?
Dan Caligor: It is not actually syncing; it’s local in both cases. Remember what I said: we want to be able to operate with low bandwidth. What happens is that at the beginning of the session the backing assets are downloaded to your computer as fast as your link will download them. If you are using a slow internet connection it’s probably asymmetrical, meaning your download speed is faster than your upload speed, so that works well.
Then once the assets are on your machine and also on the producer’s machine, when the producer presses play or record, all that is going from our server to the local machine is a command, just a few bytes, that says ‘start playing, start recording’, and the playback of the asset and recording of the resulting audio is all done locally.
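To give a sense of how small such a control command can be, here is a minimal Python sketch. It is not SoundStreak's actual protocol, which the interview does not describe; the JSON message format, the field names and the function names are all assumptions made for illustration.

```python
import json

# Hypothetical set of transport actions; the interview only mentions
# "start playing" and "start recording".
VALID_ACTIONS = ("play", "record", "stop")


def encode_command(action, take_number):
    """Serialize a transport command as compact JSON bytes (assumed format)."""
    if action not in VALID_ACTIONS:
        raise ValueError("unknown action: " + action)
    message = {"a": action, "t": take_number}
    # Compact separators keep the payload to a couple of dozen bytes.
    return json.dumps(message, separators=(",", ":")).encode("utf-8")


def decode_command(payload):
    """Parse a command payload back into (action, take_number)."""
    message = json.loads(payload.decode("utf-8"))
    return message["a"], message["t"]


command = encode_command("record", 3)
print(len(command), "bytes:", command)
```

For comparison, one second of uncompressed 48 kHz, 16-bit mono audio is 96,000 bytes, which is why sending only commands and playing assets locally needs so little bandwidth.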
NYCPPNEWS: Is there time code involved? How are you syncing to the image?
Dan Caligor: Right now we record in AIFF, WAVE and Broadcast Wave. Any of those formats will play back with a video asset just using first frame. So if you are recording to a 30 second video asset and you record a take, when you play back the take, it will just roll them at the same time. If you have a time-coded source or backing asset, whether it’s audio or video, and you record in Broadcast Wave, you can pass the time code through. So that allows you to do things like ADR and, you know, record drop-ins that will drop into a nonlinear timeline.
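As background on what passing time code through can involve: Broadcast Wave files store a time reference as a sample count since midnight, so a time code value can be mapped to a sample position. The Python sketch below is illustrative only and is not SoundStreak's implementation. It assumes non-drop-frame time code at 30 frames per second and a 48 kHz sample rate, and the function name is hypothetical.

```python
def timecode_to_samples(timecode, fps=30, sample_rate=48000):
    """Convert non-drop-frame HH:MM:SS:FF time code to a sample offset."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    if frames >= fps:
        raise ValueError("frame number must be below the frame rate")
    total_frames = (hours * 3600 + minutes * 60 + seconds) * fps + frames
    # Integer arithmetic avoids floating-point rounding of the offset.
    return total_frames * sample_rate // fps


# A drop-in starting ten seconds and fifteen frames into the timeline.
print(timecode_to_samples("00:00:10:15"))
```

An editing system that reads that sample offset can place the recorded take at the matching point on its timeline without manual alignment.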
NYCPPNEWS: Where do you see the initial interest in this service coming from?
Dan Caligor: We think of our market as having three distinct segments. The first we call enterprise users and right now this seems to be mainly television and cable networks that do a lot of on air promotion and a lot of different kinds of voice over. To some degree it also includes other sorts of industry verticals like advertising.
We define that enterprise market as being high volume users in an in-house setting. So if you’re the on air promo department at one of the big three networks, you have guys that just record on air promos all day and SoundStreak is a better mousetrap than that.
The second segment is independents. These are people who if they wanted to do a remote session before SoundStreak needed to either invest heavily in hardware and software or needed to go and rent a studio at one or both ends of the session. Now they can just use SoundStreak and for them this is a tremendous cost saving. This either reduces their budget tremendously or makes remote sessions possible where they were not within reach before.
Between these two we have studios that we think of as essentially resellers of SoundStreak. So the way they are going to use it is as a substitute for ISDN. They are typically going to either rent out a room or provide a bundle of services including recording, mixing and so on. Today they might hand their customer–whether it be an ad agency or a TV network that’s outsourced something to them–a bill that has eight line items on it. Maybe they are subcontracting a room somewhere else that also has its own line items on it. SoundStreak would become one of those line items.
Now for those guys it is a little more complicated because honestly there are a lot of cases where previously you would have needed two rooms and now you only need one. And that’s still sort of playing itself out in the market.
NYCPPNEWS: You have a Mac product, so when you get the Windows platform done you’ll be ready to launch?
Dan Caligor: (The product) will be much more powerful that way. The other thing I would say is that, from a business perspective, we are very intent on making it easy to try SoundStreak because we really think it is the best alternative out there. We’re making it as effortless and frictionless as possible to try SoundStreak. We want to make it so you don’t have to think about it at all. You can download our software, activate your SoundStreak membership and start doing a session in less than 10 minutes. We don’t want to muddy the waters with charging for it. So that’s really the reason that we describe ourselves as in beta. It’s not because the product is not fully baked. It’s that we want people to find it a no-brainer to try.
NYCPPNEWS: What do you see as a particular challenge in this next stage?
Dan Caligor: I see the main challenge as its adoption; SoundStreak is a much better mousetrap. But you won’t know that until you’ve used it and you are clear about its features and reliability.
The biggest challenge is getting people to focus on it long enough to try it and see how great it is. Fortunately there are many addressable markets for us right now, everything from video game production to on air promotion and (feature) animation. Each of those is a relatively small and self-contained little market segment and therefore fairly easy to reach. So I am not so worried about advertising and marketing.
There are always two ends at every session. Every time a SoundStreak session happens there is a voice actor and a producer/director and each of those people will work later on with other producers or talent. The word can spread pretty virally.
Talent, by the way, loves this because it makes their own studios cheaper; it gives them a lot of freedom. You can take a really high quality sort of portable studio on the road with you with a low end MacBook Air and a briefcase box. It’s dead simple to use and it doesn’t cost them anything. Each session is paid for only once, and in most cases that single payment is provided by the production end.
NYCPPNEWS: Now let me understand this: although the session is paid via this token system, the assets are held on your servers. So if people want continuing access, do they keep downloading them, or do you just transfer everything off and say it’s yours, deal with it? How does that work?
Dan Caligor: Great question. In our current formulation your take assets are being archived on our server. You can download them any time from the links in the session summary, as long as you have that email. If you lose the email or for some reason can’t retrieve them, you can ask us to retrieve them for you. Right now it’s essentially a safety measure. We are building a portal that will allow people to do a number of things, including revisiting old sessions and downloading assets. At some point we may offer service levels or something like that. Eventually, we may establish a policy of archiving assets (backing them up, then removing them from the server) after some period if the server is overloaded. Right now, however, we are not worried about that. It’s not a problem to just keep all of the assets on our server.
NYCPPNEWS: But then again they are audio-sized files so they are not necessarily very large.
Dan Caligor: They are not huge, but you can record in SoundStreak at up to 96 kHz at 24 bits, which is a crazy high resolution for such audio. I can’t think of any reasons to want to do that, but you could if you wanted. Whatever resolution you recorded in, that is what is going to be on our server and that is what is going to be on your production server as well.
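For a rough sense of the file sizes involved, uncompressed audio size is simply duration times sample rate times bytes per sample times channel count. The Python sketch below works through that arithmetic; the 30-second mono take and the 1 Mbps upload speed are illustrative assumptions, not figures from the interview, and file header overhead is ignored.

```python
def take_size_bytes(duration_s, sample_rate=96000, bit_depth=24, channels=1):
    """Size of uncompressed PCM audio in bytes, ignoring file headers."""
    bytes_per_sample = bit_depth // 8
    return int(duration_s * sample_rate * bytes_per_sample * channels)


def upload_seconds(size_bytes, upload_mbps):
    """Time to send size_bytes over a link of upload_mbps megabits per second."""
    return size_bytes * 8 / (upload_mbps * 1_000_000)


high_res = take_size_bytes(30)                                   # 96 kHz, 24-bit
standard = take_size_bytes(30, sample_rate=48000, bit_depth=16)  # 48 kHz, 16-bit
print(high_res, standard)
print(upload_seconds(high_res, 1.0))  # assumed 1 Mbps upload link
```

Under these assumptions a 30-second take is about 8.6 MB at 96 kHz and 24 bits versus about 2.9 MB at 48 kHz and 16 bits, so the files stay modest but the highest setting triples the transfer time.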
NYCPPNEWS: Besides just voice-overs, could you record musicians too?
Dan Caligor: That is an often-asked question. We are not a good tool for people in different locations jamming together. You would not want to do a multi-location recording with SoundStreak. Frankly, as far as I can tell that’s true with any current technology.
NYCPPNEWS: I was thinking more about just a one-ended session. Let’s say there are some great musicians in the Ukraine and I want to use them for this ad and we will get them into a room to play something.
Dan Caligor: That is a dead on description of our best use in music, to either allow somebody to remotely monitor a session and automatically receive the transferred assets or for somebody to provide a sound bed and have a musician for instance lay a track over it.
Now the key use cases for SoundStreak in its current form are: first, one or both ends of the session are remote, meaning the talent or the production person or the client who they want to monitor the session are not all in the same place. That is number one. And second, there is some degree of supervision or monitoring. If you just want to record a session by yourself, supervised or unsupervised, in the next version of SoundStreak you’ll be able to. For instance, we’re planning to offer products for auditions. It will allow people to do unsupervised recordings that will be automatically delivered to casting professionals. Although if all you want to do is record an audio file and send it to somebody, you can do that just as well with GarageBand and email it if you want, if you are determined.
SoundStreak is really targeting people who are interested in collaborative remote sessions and who probably have time sensitivities and want high quality. One of the key differences between the workflow of SoundStreak and say with ISDN is that SoundStreak actually records each take as its own audio file and automatically labels it with a naming convention. It will include the name of the talent and the date and the name of the session and so on.
So there has been a lot of attention paid to workflow and to just having the take files show up in the right place and be easy to identify and self-contained. Each one has a timing, so when you get that session summary email it says ‘Take three, 32 seconds’, it has the comments that the production person might have put in the take list as they were doing the session, and it has a link to the actual take.
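As an illustration of the kind of automatic take labeling described here, the Python sketch below builds a file name from the talent name, session name, date and take number. The interview does not specify SoundStreak's actual naming convention, so the field order, separators and function name are assumptions.

```python
import re
from datetime import date


def take_filename(talent, session, take_number, recorded_on, extension="wav"):
    """Build a self-describing take file name (hypothetical convention)."""

    def slug(text):
        # Replace anything that is not a letter or digit with a hyphen.
        return re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-")

    return "{}_{}_{}_take{:02d}.{}".format(
        slug(session), slug(talent), recorded_on.isoformat(), take_number, extension
    )


print(take_filename("Jane Doe", "Fall Promo", 3, date(2012, 5, 1)))
```

Putting the session, talent and date in every file name means a take remains identifiable even after it has been moved out of its session folder or forwarded to an editor.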
NYCPPNEWS: The product sounds very useful. Good luck.
Dan Caligor: Thanks for your interest.