The brief was simple on paper: warm, welcoming, make someone picture themselves walking the shingle before they've booked anything. I recorded the script three ways. The first was straight and measured - correct, professional, forgettable. The second leaned warm, almost smiling through the read. The third I slowed by roughly 15%, with a deliberate half-beat of space before the words "Suffolk coast" each time they appeared - the kind of pause you'd naturally give a place name you actually love, rather than one you're reading off a page.
The client picked the third without hesitation. When I asked why, the answer wasn't about technique - it was "it sounded like you meant it." That's the part worth sitting with: the difference between version one and version three wasn't diction, projection, or equipment. It was 1.2 seconds of silence, placed on purpose, four times across a 90-second script.
Out of curiosity, I ran the same script through two AI voice-generation tools afterwards. Both produced technically flawless reads - correct pacing, no fluffed words, broadcast-clean audio. Neither one paused before "Suffolk coast." They couldn't, not really, because that pause wasn't a pacing rule to apply; it was a judgement about what a place means to the person reading it. You can program timing. You can't program affection.
That's the test I'd suggest if you're choosing a voice for tourism, hospitality, or anything where the listener needs to feel welcomed rather than informed: read the script aloud yourself first, twice, and notice where you naturally pause. If a voice - human or otherwise - skips that pause, it's not wrong exactly. It's just not telling the truth about how much the place is loved. That gap is small on the page and enormous in the listener's ear.
