- A short review on voice transformations at IRCAM.
- P. Lanchantin, S. Farner, C. Veaux, G. Degottex, A. Roebel,
and X. Rodet.
- In Proc. of the First International Workshop on Performative Speech
and Singing Synthesis, Vancouver, Canada, Mar. 2011
[Preprint]
- IRCAM has long experience in the analysis,
synthesis and transformation of voice. Natural voice transformations
are of great interest for many applications and can be combined with
a text-to-speech system, leading to a powerful creation tool. We
present research conducted at IRCAM on voice transformations over the
last few years. Transformations can be achieved in a global way by
modifying pitch, spectral envelope, durations etc. While it
sacrifices the possibility to attain a specific target voice, the
approach allows the production of new voices of a high degree of
naturalness with different gender and age, modified vocal quality,
or another speech style. These transformations can be applied in
real time using ircamTools TRAX. Transformation can also be done in a
more specific way in order to transform a voice towards the voice of
a target speaker. Finally, we present some recent research on the
transformation of expressivity.
- Also presented at The 14th International Conference on Digital Audio Effects
(DAFx-11), Paris, France, Sept. 2011
[Preprint]
- Ensemble hand-clapping experiments under the influence of delay
and various acoustic environments
- S. Farner, A. Solvang, A. Sæbø, and P. Svensson
- Journal of the Audio Engineering Society, 57 (12), Dec. 2009,
pp. 1028-1041
- This study presents hand-clapping experiments performed to increase knowledge of distributed musical performance with an inter-musician sound delay of up to 68 ms in virtual reverberant and anechoic environments as well as in real reverberant environments. Four reactions to increasing delay were studied: tempo decrease, imprecision, leader-follower strategy, and ensemble performance quality as judged by the subjects. The results suggest that the behavior changes at two different delays, of about 20 ms and 35–50 ms, respectively. The influence of the acoustical environment was ambiguous and needs further study.
- Natural transformation of type and nature of the voice for extending
vocal repertoire in high-fidelity applications
- S. Farner, A. Röbel, and X. Rodet
- In Proc. of the 35th International AES Conference (Audio for Games), London, UK, Feb 2009
- Natural voice transformation will reduce the need
for authentic voices in many situations, ranging from vocal services
via education and entertainment to artistic
applications. Transformation of one voice to correspond to that of
another person has been studied for decades but still suffers from
limitations that we propose to overcome by an alternative approach. It
consists in modifying pitch, spectral envelope, durations etc. in a
global way. While it sacrifices the possibility to attain a specific
target voice, the approach allows the production of new voices of a
high degree of naturalness with different sex and age, modified vocal
quality (soft, breathy, and whisper), or another speech style
(dullness and eagerness). The transformation of sex and age has been
evaluated by a listening test.
- [Sound examples]
- Quantifying the strategy taken by a pair of
ensemble hand-clappers under the influence of delay.
- N. Darabi, P. Svensson, and S. Farner.
- In Proc. of the 125th AES Convention, San Francisco, CA, USA, Oct 2008
- Pairs of subjects were placed in two acoustically
isolated rooms and clapped together under the influence of a delay of
up to 68 ms. Their trials were recorded and analyzed in terms of a
compensation factor (CF). This parameter was calculated from the
recorded observations for both performers as a discrete function of
time and is regarded as a measure of the strategy taken by the subjects
while clapping. CF was shown to increase linearly with increasing
delay, as is desirable to avoid a tempo decrease at such high
latencies. Theoretically, a critical value for CF was defined as tempo
over measure (or beat) duration and was used to explain why very short
latencies may lead to a tempo acceleration, in accordance with the
Chafe effect.
- Voice transformation and speech synthesis for video games.
- S. Farner, Ch. Veaux, G. Beller, X. Rodet, and L. Ach
- Presented at Paris Game Developers Conference, Paris, France, June 2008
- Voice and expressivity transformation as well as
text-to-speech synthesis with a high degree of naturalness are now
available. A set of tools permitting a large range of voices to be
made from a single voice, whose speech may be produced from text and
given a certain expressivity, is proposed. In the context of
multiplayer video games, for instance, this technology allows for
the creation of the speech of non-player characters as well as for
transforming the player’s voice into the voice of her character. The
technology behind these tools will be presented. A demonstration using
cartoon characters will also be provided.
- [Description] [Presentation] [Sound examples]
- Electrophysiological Study of Algorithmically Processed
Metric/Rhythmic Variations in Language and Music
- S. Ystad, C. Magne, S. Farner, G. Pallone, M. Aramaki, M. Besson,
and R. Kronland-Martinet
- EURASIP Journal on Audio, Speech, and Music Processing,
vol. 2007, Article ID 30194, 13 pages, 2007, doi:10.1155/2007/30194
- This work is the result of an interdisciplinary
collaboration between scientists from the fields of audio signal
processing, phonetics and cognitive neuroscience aiming at studying
the perception of modifications in meter, rhythm, semantics and
harmony in language and music. A special time-stretching algorithm
was developed to work with natural speech. In the language part,
French sentences ending with tri-syllabic congruous or incongruous
words, metrically modified or not, were made. In the music part,
short melodies made of triplets, rhythmically and/or harmonically
modified, were built. These stimuli were presented to a group of
listeners who were asked to focus their attention either on
meter/rhythm or semantics/harmony and to judge whether or not the
sentences/melodies were acceptable. Language ERP analyses indicate
that semantically incongruous words are processed independently of
the subject's attention, thus arguing for automatic semantic
processing. In addition, metric incongruities seem to influence
semantic processing. Music ERP analyses show that rhythmic
incongruities are processed independently of attention, revealing
automatic processing of rhythm in music.
- Ensemble hand-clapping experiments under the influence of delay
and various acoustic environments
- S. Farner, A. Solvang, A. Sæbø, and P. Svensson.
- In Proc. of the 121st AES Convention, Preprint No. 6905, San Francisco, CA, USA, Oct 2006.
- Hand-clapping experiments were performed by pairs
of subjects under the influence of a delay of up to 68 ms in various
acoustic environments. The mean tempo decreased almost linearly as a
function of the delay. During each sequence the tempo slowed down to
a degree that increased with the delay, but for delays shorter than
about 15-23 ms, the tempo increased during the sequence. For the
timing imprecision, and for the subjects' judgements of their own
ensemble performance, no effect of the delay could be observed up to
20 ms. Above 32 ms the effects were observed to increase
with the delay. Virtual anechoic conditions led to a higher
imprecision than the reverberant conditions, and real-reverberation
conditions led to a slightly lower tempo.
- [Preprint © AES] [Poster]
- Timbre variations as an attribute of naturalness in clarinet
play
- S. Farner, R. Kronland-Martinet, T. Voinier, and S. Ystad.
- Computer Music Modeling and Retrieval. Third International Symposium
CMMR 2005, Pisa, Italy, September 2005. Published in Lecture Notes in
Computer Science, Vol. 3902, Springer-Verlag, May 2006, pp. 45-53,
ISBN 3-540-34027-0
- A digital clarinet played by a human and timed by a
metronome was used to record two playing control parameters, the
breath control and the reed displacement, for 20 repeated
performances. The regular behaviour of the parameters was extracted
by averaging and the fluctuation was quantified by the standard
deviation. It was concluded that the movement of the parameters
seems to follow rules. When removing the fluctuations of the
parameters by averaging over the repetitions, the result sounded
less expressive, although it still seemed to be played by a human.
The variation in timbre during playing, in particular within a
note's duration, was observed and then fixed while the natural
temporal envelope was kept. The result seemed unnatural, indicating
that the variation of timbre is important for naturalness.
- [Preprint © Springer]
- Contribution to harmonic balance calculations of self-sustained
periodic oscillations with focus on
single-reed instruments
- S. Farner, C. Vergez, J. Kergomard, and A. Lizée
- Journal of the Acoustical Society of America,
119 (March 2006), pp. 1794-1804
- The harmonic balance method (HBM) was originally developed for
finding periodic solutions of electronic and mechanical systems
under a periodic force, but has later been adapted to self-sustained
musical instruments. Unlike time-domain methods, this
frequency-domain method does not capture transients and so is not
suited to sound synthesis. However, its independence of time makes
it very useful for studying every periodic solution of the system,
whether stable or unstable, without regard to particular initial
conditions in time. A computer program for solving general problems
involving nonlinearly coupled exciter and resonator, "Harmbal",
has been developed based on the HBM. The method as well as
convergence improvements and continuation facilities are thoroughly
presented and discussed in the present paper. Application of the
method is demonstrated especially on problems with severe
difficulties of convergence, i.e. the Helmholtz motion (square
signals) of single-reed instruments when no losses are taken into
account, the reed being modelled as a simple spring.
- Comparing spectral distance measures for join cost optimization
in concatenative speech synthesis.
- I. Bjørkan, T. Svendsen, and S. Farner
- In Proc. of Interspeech 2005, Lisboa, Portugal,
Sept 2005, pp. 2577-2580
- In concatenative synthesis the join cost function
can be related to the probability of a perceived discontinuity at
the join. Therefore it is important that the distance measures in
the cost function correlate highly with human-perceived
discontinuities. In this paper the results of a listening test on
joins in two Norwegian long vowels, /A:/ and /e:/, are
presented. Five spectral distance measures and the F0 difference are
compared as predictors of the human-perceived discontinuities using
Receiver Operating Characteristic (ROC) curves. In addition, a
linear join cost function is optimized by means of stepwise linear
regression.
- [Preprint]
- Comparison of rhythmic processing in language and music: an
interdisciplinary approach.
- C. Magne, M. Aramaki, C. Astesano, R. L. Gordon, S. Ystad,
S. Farner, R. Kronland-Martinet, and M. Besson.
- Journal of Music and Meaning 3,
Fall 2004/Winter 2005, sec.5.1 (Online journal)
- In this paper we describe an interdisciplinary
collaboration between phoneticians, acousticians and
neuroscientists that led to a study of rhythm in music and
language. In the first part of the paper we discuss general
aspects of rhythm, with a short overview of some earlier studies
on the cultural influences of linguistic rhythm on musical
rhythm. In the second part, we describe an experimental procedure
aimed at comparing the perception of rhythmic and semantic
violations in language with the perception of rhythmic and
harmonic violations in music. Subjects listened to different
sentences and melodies and were asked to focus on either rhythm or
semantics/harmony and to indicate whether or not the last
word/arpeggio was acceptable in the context. The
Event-Related Brain Potential method was used to study perceptual
and cognitive processing related to the rhythmic and
semantic/harmonic incongruities. The results indicated that the
processing of rhythmic incongruities was associated with increased
positive deflections in the Brain Potential in similar latency
bands in both language and music. However, these positive
components were present independently of the participants’ focus
in the music part while they were only present when the
participants focused on semantics in the language part.
- Some aspects of the harmonic balance method applied to the clarinet
- C. Fritz, S. Farner, and J. Kergomard
- Applied Acoustics, 65 (2004), pp. 1155-1180
- The clarinet has been extensively studied by various
theoretical and experimental techniques. In this paper, the harmonic
balance method (HBM), a numerical method mainly working in the
frequency domain, has been applied to solve a simple nonlinear
clarinet model consisting of a linear exciter (for the reed)
nonlinearly coupled to a linear resonator with visco-thermal losses
(for the pipe). A recent and improved implementation of the HBM for
self-sustained instruments has allowed us to study the model
theoretically when including dispersion in the pipe or mass and
damping terms in the reed model. The resulting periodic solutions for
the internal pressure spectrum and the corresponding playing frequency
are shown to align well with previous theoretical and experimental
knowledge of the clarinet. Finally, we present and briefly discuss a
few (probably unstable) oscillation regimes both with the HBM and with
a real clarinet.
- Convergence improvement of the harmonic balance method to
obtain periodic solutions for self-sustained musical instruments
- S. Farner, C. Vergez, and J. Kergomard
- In Proc. of the International Congress on Acoustics (ICA) 2004,
Kyoto, Japan, April 2004, pp. 1429-1432
- The harmonic balance method was originally developed
for finding periodic solutions of forced-oscillation electronic
circuits but has later been adapted to self-sustained musical
instruments. A computer program for solving general problems
involving nonlinearly coupled exciter and resonator has been developed
using this method, and both convergence and continuation have been
improved. We briefly present the harmonic balance method before
we address one specific problem of convergence due to sampling and
describe the backtracking algorithm used to efficiently reduce the
problem. This improvement facilitates continuation and we demonstrate
the method on a simple model of a clarinet and compare with analytical
results. For example, in the lossless approximation, it is verified
that the method converges toward the Helmholtz motion, and we apply it
to draw the bifurcation diagram when the blowing pressure is
increased.
- [Preprint]
- A new method for the calculation of self-sustained
oscillations: the perturbation of the Helmholtz motion
- J. Kergomard, S. Divoir, S. Farner, and C. Vergez
- In Proc. of Stockholm Music Acoustics Conference (SMAC) 2003,
Stockholm, Sweden, August 6-9, 2003, pp. 397-400
- When losses are ignored, elementary solutions for the classical models of
self-sustained instruments, such as reed or bowed string instruments, are
pure square or "rectangular" signals, called Helmholtz motion. When losses
are introduced, rounded-corner signals are obtained, and the calculation
becomes delicate. Ab initio calculation is possible, but methods limited to
the steady-state regime make it easier to study the influence of the parameters
on the spectrum and the playing frequency: the harmonic balance is well
known, but, because losses are small, another iterative technique is
suggested. Considering e.g. reed instruments, the Fourier components of the
input pressure signal can be divided into two parts: the components with
high input impedance, and those with low input impedance (corresponding to
the missing harmonics of the rectangular signal). A perturbation method
can be obtained by starting from infinite and zero impedances, respectively.
A key point is that at each step, frequency is fixed in order to calculate
the perturbation, then a new value is calculated using any equation of the
harmonic balance system, an excellent candidate being the reactive power
defined by Boutillon. In this preliminary study, results are compared for a
simplified problem to those of the harmonic balance method, and they are
very interesting, especially far from the oscillation thresholds.
- [Preprint]
- Influence of rhythmic, melodic, and semantic violations in
language and music on the electrical activity in the brain
- S. Ystad, C. Magne, S. Farner, G. Pallone, V. Pasdeloup,
R. Kronland-Martinet, and M. Besson
- In Proc. of Stockholm Music Acoustics Conference (SMAC) 2003,
Stockholm, Sweden, August 6-9, 2003, pp. 671-674
- The work presented here is part of a larger multidisciplinary
project associating audio signal processing, linguistics, and
cognitive neurosciences. It aims at comparing and better
understanding how music and language are processed in the
brain. From a music and speech synthesis point of view, this is
important when striving for naturalness and expressiveness in
synthesized music and language. As a first experiment towards this
goal we have manipulated the rhythm in music and language as well
as harmony and semantics, respectively. Since we wanted to work
with natural speech, we developed and used a method to extend a
given part of an audio signal without altering the timbre. This
allows manipulations of the syllable lengths in the language
part. In the music part we used piano tones from a sampler as a
first approach. The note duration and melody of the musical
sequences were digitally modified by altering the MIDI codes. In
the language part of the experiment, participants were presented with
sentences where the final word was either semantically congruous
or incongruous (e.g., I take coffee with sugar/dog). In the
musical sequences, the final part of the melody was either in or
out of tonality. Moreover, the penultimate (second last) syllable
or note was either of natural duration or increased in duration,
in order to produce rhythmic incongruities. Thus, the two factors
rhythm and semantics/tonality were independently
manipulated. Changes in brain electrical activity were measured
from 28 electrodes on the scalp using an Event-Related Potential
method. Preliminary results show that similar reactions can be
observed in language and music, at least for rhythmic violations.
- [Preprint]
- Remelting by Continuous Feeding of Rolled Scrap into a
Melt
- Snorre Farner, Frede Frisvold, and Thorvald Abel Engh
- Light Metals 2000, The Minerals, Metals & Materials Society,
pp. 699-704
- Metal losses during remelting are common when recycling aluminium.
Reduction of these losses could give a substantial economic gain.
Experiments with continuous feeding of aluminium plates into
molten aluminium have been performed. A simple steady-state
mathematical model has been developed that gives the temperature
profile and the penetration depth into the melt as a function of
the feeding velocity, superheat, and the heat-transfer
coefficients from melt to solid and from a solidified shell to the
plate. A criterion for shell formation is also formulated. The
results can be applied to understand more complex systems where
shredded scrap is fed into molten aluminium. The model presented
could be of direct interest when feeding rolled scrap into molten
aluminium.
- [Preprint © TMS]
- Evidence for unconventional superconductivity of
Sr2RuO4 from specific-heat measurements.
- S. Nishizaki, Y. Maeno, S. Farner, S. Ikeda, and T. Fujita.
- J. Phys. Soc. Japan, 67 (1998), pp. 560-563
- We measured the specific heat of single crystals of the non-cuprate
layered perovskite superconductor Sr2RuO4. The crystals with different
Tc up to 1.2 K exhibit a large residual electronic coefficient γ0,
which decreases systematically with increasing Tc. This behavior is
consistent with the presence of nodes in the superconducting gap and
with the variations of Tc due to pair breaking by impurities and
defects. To quantitatively account for the observed large γ0, however,
we need to introduce an additional mechanism.
- Pairing symmetry of superconducting Sr2RuO4
from specific heat measurements.
- S. Nishizaki, Y. Maeno, S. Farner, S. Ikeda, and T. Fujita.
- Physica C, 282 (1997), pp. 1413-1414
- We report the low-temperature specific heat of single crystals of
the non-cuprate layered perovskite superconductor Sr2RuO4 (Tc ~ 1 K).
In this paper we focus on the relation between the residual value γ0
of the electronic specific heat and the specific-heat jump across Tc.
The results provide strong evidence for an unconventional
superconducting state.