Skip to content

[WIP] Text to speech mixin#405

Draft
duwudi wants to merge 23 commits into
masterfrom
tts
Draft

[WIP] Text to speech mixin#405
duwudi wants to merge 23 commits into
masterfrom
tts

Conversation

@duwudi
Copy link
Copy Markdown
Contributor

@duwudi duwudi commented Jul 28, 2021

Closes #376

Note on dependencies

pyfestival Python dependency is provided here into our apt repository, with a patch that is needed in order for import to work on RPi. Workflow run that added it into the apt repo is here. There should be no need to keep https://github.com/pi-top/pyfestival/ at this point.

Example Usage

from pitop import Pitop

pt = Pitop()
pt.speak("Hello")

# change voice to British English (default is "us")
pt.speak.set_voice("english")

# non-blocking voice request
pt.speak("Hello", blocking=False)

Architecture

Uses a generalized object factory that registers backend services we wish to make available - currently we only have "DEFAULT" and "FESTIVAL" but in future we'll ideally have "GOOGLE", "AMAZON" etc. The builders will accept the kwargs relevant to that service and ignore the rest, this should make it flexible when we add new services that require API keys etc.

Options for changing tts backend

One downside to this approach is that the speech service can't be changed easily from the Pitop() class, one way to do it is as follows:

from pitop import Pitop
from pitop.processing import tts

my_speech_service = speech.services.get("GOOGLE")

pitop = Pitop()
pitop.speech = my_speech_service 

Alternatively, we could pass in the speech service to Pitop class:

pitop = Pitop(tts_service_id ="FESTIVAL")

But it feels a shame to add parameters to the Pitop() class. We could also use a class method:

pitop = Pitop.using_speech_service("FESTIVAL")

which is probably a cleaner approach.

To do

  1. Use a factory pattern for allowing different speech backends to be swapped in as it's very likely we'll want to change the TTS engine in future (for compatibility or general improvement)
  2. Discuss if speech service settings will be included in the Pitop.from_config() usage
  3. Fix SIOD ERROR: the currently assigned stack limit has been exceeded error when using the non-blocking mode (thread)

@codecov
Copy link
Copy Markdown

codecov Bot commented Jul 28, 2021

Codecov Report

Merging #405 (5807d2c) into master (854e835) will decrease coverage by 3.97%.
The diff coverage is 60.00%.

❗ Current head 5807d2c differs from pull request most recent head ef0a566. Consider uploading reports for the commit ef0a566 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #405      +/-   ##
==========================================
- Coverage   58.56%   54.58%   -3.98%     
==========================================
  Files          72       81       +9     
  Lines        2756     3270     +514     
==========================================
+ Hits         1614     1785     +171     
- Misses       1142     1485     +343     
Flag Coverage Δ
unittests 54.58% <60.00%> (-3.98%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pitop/system/pitop.py 50.00% <50.00%> (-7.15%) ⬇️
pitop/core/mixins/supports_speech.py 57.14% <57.14%> (ø)
pitop/core/mixins/__init__.py 100.00% <100.00%> (ø)
pitop/robotics/simple_pid/PID.py
pitop/robotics/simple_pid/__init__.py
pitop/miniscreen/oled/core/__init__.py 100.00% <0.00%> (ø)
pitop/miniscreen/buttons/buttons.py 47.69% <0.00%> (ø)
pitop/miniscreen/oled/oled.py 33.46% <0.00%> (ø)
pitop/miniscreen/miniscreen.py 48.33% <0.00%> (ø)
... and 7 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4b4d30d...ef0a566. Read the comment docs.

Comment thread pitop/processing/speech/text_to_speech.py Outdated
@duwudi duwudi changed the title Text to speech mixin [WIP] Text to speech mixin Aug 10, 2021
@duwudi duwudi requested a review from angusjfw August 11, 2021 08:24
Comment thread debian/py3dist-overrides
systemd_python python3-systemd; PEP386
wget python3-wget; PEP386
pyzmq python3-zmq; PEP386
pyfestival python3-pt-pyfestival
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pyfestival python3-pt-pyfestival
pyfestival python3-pyfestival; PEP386

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP386 flag will make the dependency versioned, which we want in most cases.

Comment thread debian/control
pt-device-manager (<< 4.0.0),
# pt-oled
python3-pt-oled (<< 3.0.0),
# Festival speech engine
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently under Replaces: field, which is definitely not what you want to be doing.

Are you trying to make this a core dependency, an optional dependency or a dependency of the full SDK?

Comment thread setup.py
##################
# Text to Speech #
##################
"pt-pyfestival",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"pt-pyfestival",
"pyfestival",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not install correctly for people installing via Python on RPi for now, but we want tackle that later on.

@m-roberts
Copy link
Copy Markdown
Contributor

I am not convinced with the design pattern. Why should the user need to know anything about festival? Why does the user need to handle the importing and management of TTS directly?

from pitop import Pitop

pitop = Pitop()
pitop.speak("Hello World")
# Another option might be a 'speaker' property of the pi-top:
# pitop.speaker.speak("Hello World")

seems like a much more sensible way to approach this.

@m-roberts
Copy link
Copy Markdown
Contributor

This is currently failing to build the package as there is a spelling mistake in this commit's message that is propagating into the snapshot version changelog.

This draft PR provides an easy way for all package builds to ignore changelog spelling mistakes for snapshot builds with a simple boolean input. This hasn't been tested, and this PR is a good test of this functionality.

We should update this PR to use this experimental branch until it passes, then merge in and build against this new master commit. Once this is working here, we can update ALL package build workflows to ignore typos for snapshot builds moving forwards.

@duwudi
Copy link
Copy Markdown
Contributor Author

duwudi commented Aug 25, 2021

I am not convinced with the design pattern. Why should the user need to know anything about festival? Why does the user need to handle the importing and management of TTS directly?

from pitop import Pitop

pitop = Pitop()
pitop.speak("Hello World")
# Another option might be a 'speaker' property of the pi-top:
# pitop.speaker.speak("Hello World")

seems like a much more sensible way to approach this.

Yeah this is exactly how it is now, though I do like the “speaker” idea as we can add more speaker-specific commands to that too (beeps, songs, pre-recorded voices etc)

@m-roberts
Copy link
Copy Markdown
Contributor

pyfestival is currently failing to build: https://github.com/pi-top/pi-topOS-Upstream-Package-Build/runs/3455683743?check_suite_focus=true

We will need to create a patch for the changes needed for this build to pass in CI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add text-to-speech functionality

3 participants