Jekyll2020-04-22T08:18:42+00:00https://mortendahl.github.io/feed.xmlCryptography and Machine LearningMixing both for privacy-preserving machine learningGrowing TF Encrypted2019-05-17T10:00:00+00:002019-05-17T10:00:00+00:00https://mortendahl.github.io/2019/05/17/growing-tf-encrypted<p>What <a href="/2018/03/01/secure-computation-as-dataflow-programs/">started out as a side project</a> less than two years ago is growing up and moving into its <a href="https://github.com/tf-encrypted">own organization on GitHub</a>!</p>
<p>The tremendous growth we have seen would not have been possible without partner contributors, and with this move TF Encrypted is being cemented as an independent community project that can encourage participation and remain focused on its mission: getting privacy-enhancing tools into the hands of machine learning practitioners.</p>
<p><em>This is a <a href="https://medium.com/dropoutlabs/growing-tf-encrypted-a1cb7b109ab5">cross-posting</a> of work done at <a href="https://dropoutlabs.com">Dropout Labs</a>. A big thank you to <a href="https://twitter.com/gavinuhma">Gavin Uhma</a>, <a href="https://twitter.com/ianlivingstone">Ian Livingstone</a>, <a href="https://twitter.com/jvmancuso">Jason Mancuso</a>, and <a href="https://twitter.com/m__maclellan">Matt Maclellan</a> for help with this post.</em></p>
<h1 id="a-framework-for-encrypted-deep-learning">A Framework for Encrypted Deep Learning</h1>
<p>TF Encrypted makes it easy to apply machine learning to data that remains encrypted at all times. It builds on, and integrates heavily with, TensorFlow, providing a familiar interface and encouraging the mixing of ordinary and encrypted computations. Together, this ensures a pragmatic and gradual approach to a maturing technology.</p>
<p><img src="/assets/tfe/tfe-architecture.png" alt="" /></p>
<p>The core consists of <a href="https://en.wikipedia.org/wiki/Secure_multi-party_computation">secure computation</a> optimized for deep learning, as well as standard deep learning components adapted to work more efficiently on encrypted data. However, the whole purpose is to abstract all of this away.</p>
<p>As an example, the following code snippet shows how one can serve <a href="/2018/10/19/experimenting-with-tf-encrypted/">predictions on encrypted inputs</a>, in this case using a small neural network. It closely resembles traditional TensorFlow code, with the exception of <code class="language-plaintext highlighter-rouge">tfe.define_private_input</code> and <code class="language-plaintext highlighter-rouge">tfe.define_output</code> that are used to express our desired privacy policy: that only the client should be able to see the input and the result in plaintext, and everyone else must only see them in an encrypted state.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">tensorflow</span> <span class="k">as</span> <span class="n">tf</span>
<span class="kn">import</span> <span class="nn">tf_encrypted</span> <span class="k">as</span> <span class="n">tfe</span>
<span class="k">def</span> <span class="nf">provide_weights</span><span class="p">():</span> <span class="s">"""Load model weight from disk using TensorFlow."""</span>
<span class="k">def</span> <span class="nf">provide_input</span><span class="p">():</span> <span class="s">"""Load and preprocess input data locally on the client."""</span>
<span class="k">def</span> <span class="nf">receive_output</span><span class="p">(</span><span class="n">logits</span><span class="p">):</span> <span class="k">return</span> <span class="n">tf</span><span class="o">.</span><span class="k">print</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">argmax</span><span class="p">(</span><span class="n">logits</span><span class="p">))</span>
<span class="n">w0</span><span class="p">,</span> <span class="n">b0</span><span class="p">,</span> <span class="n">w1</span><span class="p">,</span> <span class="n">b1</span><span class="p">,</span> <span class="n">w2</span><span class="p">,</span> <span class="n">b2</span> <span class="o">=</span> <span class="n">provide_weights</span><span class="p">()</span>
<span class="c1"># run provide_input locally on the client and encrypt
</span><span class="n">x</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">define_private_input</span><span class="p">(</span><span class="s">"prediction-client"</span><span class="p">,</span> <span class="n">provide_input</span><span class="p">)</span>
<span class="c1"># compute prediction on the encrypted input
</span><span class="n">layer0</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">relu</span><span class="p">(</span><span class="n">tfe</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">w0</span><span class="p">)</span> <span class="o">+</span> <span class="n">b0</span><span class="p">)</span>
<span class="n">layer1</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">relu</span><span class="p">(</span><span class="n">tfe</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">layer0</span><span class="p">,</span> <span class="n">w1</span><span class="p">)</span> <span class="o">+</span> <span class="n">b1</span><span class="p">)</span>
<span class="n">logits</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">layer1</span><span class="p">,</span> <span class="n">w2</span><span class="p">)</span> <span class="o">+</span> <span class="n">b2</span>
<span class="c1"># send results back to client, decrypt, and run receive_output locally
</span><span class="n">prediction_op</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">define_output</span><span class="p">(</span><span class="s">"prediction-client"</span><span class="p">,</span> <span class="n">receive_output</span><span class="p">,</span> <span class="n">logits</span><span class="p">)</span>
<span class="k">with</span> <span class="n">tfe</span><span class="o">.</span><span class="n">Session</span><span class="p">()</span> <span class="k">as</span> <span class="n">sess</span><span class="p">:</span>
    <span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">prediction_op</span><span class="p">)</span>
</code></pre></div></div>
<p>Below we can see that TF Encrypted is also a natural fit for <a href="https://github.com/tf-encrypted/tf-encrypted/tree/master/examples/federated-learning">secure aggregation</a> in <a href="https://ai.googleblog.com/2017/04/federated-learning-collaborative.html">federated learning</a>. Here, in each iteration, gradients are computed locally by data owners using ordinary TensorFlow. They are then given as encrypted inputs to a secure computation of their mean, which in turn is revealed to the model owner who updates the model.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># compute and collect all model gradients as private inputs
</span><span class="n">model_grads</span> <span class="o">=</span> <span class="nb">zip</span><span class="p">(</span><span class="o">*</span><span class="p">[</span>
    <span class="n">tfe</span><span class="o">.</span><span class="n">define_private_input</span><span class="p">(</span>
        <span class="n">data_owner</span><span class="o">.</span><span class="n">player_name</span><span class="p">,</span>
        <span class="n">data_owner</span><span class="o">.</span><span class="n">compute_gradient</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">data_owner</span> <span class="ow">in</span> <span class="n">data_owners</span>
<span class="p">])</span>
<span class="c1"># compute mean gradient securely
</span><span class="n">aggregated_model_grads</span> <span class="o">=</span> <span class="p">[</span>
    <span class="n">tfe</span><span class="o">.</span><span class="n">add_n</span><span class="p">(</span><span class="n">grads</span><span class="p">)</span> <span class="o">/</span> <span class="nb">len</span><span class="p">(</span><span class="n">grads</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">grads</span> <span class="ow">in</span> <span class="n">model_grads</span>
<span class="p">]</span>
<span class="c1"># reveal only aggregated gradients to model owner
</span><span class="n">iteration_op</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">define_output</span><span class="p">(</span>
    <span class="n">model_owner</span><span class="o">.</span><span class="n">player_name</span><span class="p">,</span>
    <span class="n">model_owner</span><span class="o">.</span><span class="n">update_model</span><span class="p">,</span>
    <span class="n">aggregated_model_grads</span><span class="p">)</span>
<span class="k">with</span> <span class="n">tfe</span><span class="o">.</span><span class="n">Session</span><span class="p">()</span> <span class="k">as</span> <span class="n">sess</span><span class="p">:</span>
    <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">num_iterations</span><span class="p">):</span>
        <span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">iteration_op</span><span class="p">)</span>
</code></pre></div></div>
<p>Because of tight integration with TensorFlow, this process can easily be profiled and visualized using <a href="https://www.tensorflow.org/guide/graph_viz">TensorBoard</a>, as shown in the <a href="https://github.com/tf-encrypted/tf-encrypted/tree/master/examples/federated-learning">full example</a>.</p>
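<p>To make the data flow of the aggregation step concrete, here is a plaintext stand-in written in plain Python. It performs no encryption and does not use TF Encrypted; the gradient values and the two-variable model are made up for illustration. It only shows what <code class="language-plaintext highlighter-rouge">zip(*...)</code> and the per-variable mean compute in the snippet above.</p>

```python
# Plaintext stand-in for the aggregation step: nothing is encrypted here.
# Each data owner returns one gradient per model variable (hypothetical values).
owner_grads = [
    [[0.5, -1.0], [2.0]],  # owner 0: gradients for two variables
    [[1.5, 3.0], [4.0]],   # owner 1
    [[1.0, 1.0], [0.0]],   # owner 2
]

# zip(*...) regroups the gradients per variable instead of per owner
model_grads = list(zip(*owner_grads))


def mean_gradient(grads):
    """Element-wise mean of equally shaped gradient vectors."""
    return [sum(values) / len(grads) for values in zip(*grads)]


# one averaged gradient per model variable
aggregated_model_grads = [mean_gradient(grads) for grads in model_grads]
print(aggregated_model_grads)
```

<p>In the encrypted version the individual entries of <code class="language-plaintext highlighter-rouge">owner_grads</code> are never visible to the model owner; only the averaged result is revealed.</p>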
<p>Finally, it is also possible to perform <a href="https://github.com/tf-encrypted/tf-encrypted/tree/master/examples/logistic">encrypted training</a> on joint data sets. In the snippet below, two data owners provide encrypted training data that is merged and subsequently used as any other data set.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">x_train_0</span><span class="p">,</span> <span class="n">y_train_0</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">define_private_input</span><span class="p">(</span>
    <span class="n">data_owner_0</span><span class="o">.</span><span class="n">player_name</span><span class="p">,</span>
    <span class="n">data_owner_0</span><span class="o">.</span><span class="n">provide_training_data</span><span class="p">)</span>
<span class="n">x_train_1</span><span class="p">,</span> <span class="n">y_train_1</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">define_private_input</span><span class="p">(</span>
    <span class="n">data_owner_1</span><span class="o">.</span><span class="n">player_name</span><span class="p">,</span>
    <span class="n">data_owner_1</span><span class="o">.</span><span class="n">provide_training_data</span><span class="p">)</span>
<span class="n">x_train</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">concat</span><span class="p">([</span><span class="n">x_train_0</span><span class="p">,</span> <span class="n">x_train_1</span><span class="p">],</span> <span class="n">axis</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
<span class="n">y_train</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">concat</span><span class="p">([</span><span class="n">y_train_0</span><span class="p">,</span> <span class="n">y_train_1</span><span class="p">],</span> <span class="n">axis</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
</code></pre></div></div>
<p>The <a href="https://github.com/tf-encrypted/tf-encrypted">GitHub repository</a> contains several <a href="https://github.com/tf-encrypted/tf-encrypted/tree/master/examples">more examples</a>, including <a href="https://github.com/tf-encrypted/tf-encrypted/tree/master/examples/notebooks">notebooks</a> to help you get started.</p>
<h1 id="moving-forward-as-a-community">Moving Forward as a Community</h1>
<p>Since the beginning, the motivation behind TF Encrypted has been to explore and unlock <a href="/2017/09/19/private-image-analysis-with-mpc/">the impact of privacy-preserving machine learning</a>; and the approach taken is to help practitioners get their hands dirty and experiment.</p>
<p>A sub-goal of this is to help improve communication between people within different areas of expertise, including creating a common vocabulary for more efficient knowledge sharing.</p>
<p>To really scale this up we need to bring as many people as possible together, as this means a better collective understanding, more exploration, and more identified use cases. And the only natural place for this to happen is where you feel comfortable and encouraged to contribute.</p>
<h2 id="use-cases-under-constraints">Use cases under constraints</h2>
<p>Getting data scientists involved is key, as the technology has reached a maturity where it can be applied to real-world problems, yet is still not ready to simply be treated as a black box; even for solving problems that on paper may otherwise seem like a perfect fit.</p>
<p>Instead, to further bring the technology out from research circles, and find the right use cases given current constraints, we need people with domain knowledge to benchmark on the problems they face, and report on their findings.</p>
<p>Helping them get started quickly, and reducing their learning curve, is a key goal of TF Encrypted.</p>
<h2 id="cross-disciplinary-research">Cross-disciplinary research</h2>
<p>At the same time, it is important that the runtime performance of the underlying technology continues to improve, as this makes more use cases practical.</p>
<p>The most obvious way of doing that is for researchers in cryptography to <a href="https://guutboy.github.io/">continue the development</a> of secure computation and its adaptation to deep learning. However, this currently requires them to gain an intuition into machine learning that most do not have.</p>
<p>Orthogonal to improving <em>how</em> the computations are performed, another direction is to improve <em>what</em> functions are computed. This means adapting machine learning models to the encrypted setting and essentially treating it as a new type of computing device with its own characteristics; for which some operations, or even model types, are more suitable. However, this currently requires an understanding of cryptography that most do not have.</p>
<p>Forming a bridge that helps these two fields collaborate, yet stay focused on their area of expertise, is another key goal of TF Encrypted.</p>
<h2 id="common-platform">Common platform</h2>
<p>Frameworks like TensorFlow have shown the benefits of bringing practitioners together on the same software platform. It makes everything concrete, including vocabulary, and shortens the distance from research to application. It makes everyone move towards the same target, yet via good abstractions allows each to focus on what they do best while still benefiting from the contributions of others. In other words, it facilitates taking a modular approach to the problem, lowering the overhead of everyone first developing expertise across all domains.</p>
<p>All of this leads to the core belief behind TF Encrypted: that we can push the field of privacy-preserving machine learning forward by building a common and integrated platform that makes tools and techniques for encrypted deep learning easily accessible.</p>
<p>To do this we welcome partners and contributors from all fields, including companies that want to leverage the accumulated expertise while keeping their own focus on the remaining questions around, for instance, taking this all the way to production.</p>
<h1 id="challenges-and-roadmap">Challenges and Roadmap</h1>
<p>Building the current version of TF Encrypted was only the first step, with many interesting challenges on the road ahead. Below are a select few with more up-to-date status in the <a href="https://github.com/tf-encrypted/tf-encrypted/issues">GitHub issues</a>.</p>
<h2 id="high-level-api">High-level API</h2>
<p>As seen earlier, the interface of TF Encrypted has so far been somewhat low-level, roughly matching that of TensorFlow 1.x. This ensured user familiarity and gave us a focal point for adapting and optimizing cryptographic techniques.</p>
<p>However, it also has shortcomings.</p>
<p>One is that expressing models in this way has simply become outdated in light of high-level APIs such as <a href="https://keras.io/">Keras</a>. This is also evident in the <a href="https://medium.com/tensorflow/whats-coming-in-tensorflow-2-0-d3663832e9b8">upcoming TensorFlow 2.x</a> which fully <a href="https://www.tensorflow.org/guide/keras">embraces Keras</a> and similar abstractions.</p>
<p>The second is related to why Keras has likely become so popular, namely its ability to express complex models succinctly and closely to how we think about them. This management of complexity only becomes more relevant when you add notions of distributed data with explicit ownership and privacy policies.</p>
<p>Thirdly, with a low-level API it is easy for users to shoot themselves in the foot and accidentally use operations that are very expensive in the encrypted space. Obtaining good results and figuring out which cryptographic techniques work best for a particular model typically requires some expertise, yet with a low-level API it is hard to incorporate and distribute such knowledge.</p>
<p>As a way of mitigating these issues, we are adding a high-level API to TF Encrypted closely matching Keras, but extended to work nicely with the concepts and constraints inherent in privacy-preserving machine learning. Although still a work in progress, one might imagine rewriting the first example from above as follows.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">tensorflow</span> <span class="k">as</span> <span class="n">tf</span>
<span class="kn">import</span> <span class="nn">tf_encrypted</span> <span class="k">as</span> <span class="n">tfe</span>
<span class="k">class</span> <span class="nc">PredictionClient</span><span class="p">:</span>
    <span class="o">@</span><span class="n">tfe</span><span class="o">.</span><span class="n">private_input</span>
    <span class="k">def</span> <span class="nf">provide_input</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="s">"""Load and preprocess input data."""</span>
    <span class="o">@</span><span class="n">tfe</span><span class="o">.</span><span class="n">private_output</span>
    <span class="k">def</span> <span class="nf">receive_output</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">logits</span><span class="p">):</span>
        <span class="k">return</span> <span class="n">tf</span><span class="o">.</span><span class="k">print</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">argmax</span><span class="p">(</span><span class="n">logits</span><span class="p">))</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">keras</span><span class="o">.</span><span class="n">models</span><span class="o">.</span><span class="n">Sequential</span><span class="p">([</span>
    <span class="n">tfe</span><span class="o">.</span><span class="n">keras</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">Dense</span><span class="p">(</span><span class="n">activation</span><span class="o">=</span><span class="s">'relu'</span><span class="p">),</span>
    <span class="n">tfe</span><span class="o">.</span><span class="n">keras</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">Dense</span><span class="p">(</span><span class="n">activation</span><span class="o">=</span><span class="s">'relu'</span><span class="p">),</span>
    <span class="n">tfe</span><span class="o">.</span><span class="n">keras</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">Dense</span><span class="p">(</span><span class="n">activation</span><span class="o">=</span><span class="bp">None</span><span class="p">)</span>
<span class="p">])</span>
<span class="n">prediction_client</span> <span class="o">=</span> <span class="n">PredictionClient</span><span class="p">()</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">prediction_client</span><span class="o">.</span><span class="n">provide_input</span><span class="p">()</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
<span class="n">prediction_client</span><span class="o">.</span><span class="n">receive_output</span><span class="p">(</span><span class="n">y</span><span class="p">)</span>
</code></pre></div></div>
<p>We believe that distilling concepts in this way will improve the ability to accumulate knowledge while retaining a large degree of flexibility.</p>
<h2 id="pre-trained-models">Pre-trained models</h2>
<p>Taking the above mindset further, we also want to encourage the use of pre-trained models and fine-tuning when possible. These provide the least flexibility for users but offer great ways of accumulating expertise and lowering the investment required from users.</p>
<p>We plan on providing several well-known models adapted to the encrypted space, thus offering good trade-offs between accuracy and speed.</p>
<h2 id="tighter-tensorflow-integration">Tighter TensorFlow integration</h2>
<p>Being in the TensorFlow ecosystem has been a huge advantage, providing not only the familiarity and hybrid approach already mentioned, but also allowing us to benefit from an <a href="https://www.tensorflow.org/deploy/distributed">efficient distributed platform</a> with extensive support tools.</p>
<p>As such, it is no surprise that we want full support for one of the most exciting <a href="https://medium.com/tensorflow/whats-coming-in-tensorflow-2-0-d3663832e9b8">changes coming with TensorFlow 2.x</a>, and the improvements to debugging and exploration that come with it: <a href="https://www.tensorflow.org/guide/eager">eager evaluation</a> by default. While completely abandoning static dataflow graphs would likely have a significant impact on performance, we expect to find reasonable compromises through the new <a href="https://www.tensorflow.org/alpha/guide/autograph"><code class="language-plaintext highlighter-rouge">tf.function</code></a> and static sub-components.</p>
<p>We are also very excited to explore how TF Encrypted can work together with other projects such as <a href="https://www.tensorflow.org/federated">TensorFlow Federated</a> and <a href="https://github.com/tensorflow/privacy">TensorFlow Privacy</a> by adding secure computation to the mix. For instance, TF Encrypted can be used to realize <a href="https://github.com/tf-encrypted/tf-encrypted/tree/master/examples/federated-learning">secure aggregation</a> for the former, and can provide a complementary approach to privacy with respect to the latter.</p>
<h2 id="more-cryptographic-techniques">More cryptographic techniques</h2>
<p>TF Encrypted has been focused almost exclusively on secure computation based on secret sharing up until this point. However, in certain scenarios and models there are several other techniques that fit more naturally or offer better performance.</p>
<p>We are keen on incorporating these by providing wrappers of some of the excellent <a href="https://github.com/rdragos/awesome-mpc/">projects that already exist</a>, making it easier to experiment and benchmark various combinations of techniques and parameters, and define good defaults.</p>
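<p>For readers new to secret sharing, the basic idea can be sketched in a few lines of plain Python. This is a minimal illustration of additive secret sharing over integers, not TF Encrypted's actual protocol: the modulus, the number of shares, and all function names are assumptions chosen for the example.</p>

```python
import secrets

MODULUS = 2**64  # illustrative ring size, chosen only for this sketch


def share(value, num_shares=3):
    """Split an integer into additive shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(num_shares - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recombine all shares to recover the original value."""
    return sum(shares) % MODULUS


def add_shared(xs, ys):
    """Add two shared values; each party only adds the shares it holds."""
    return [(x + y) % MODULUS for x, y in zip(xs, ys)]


x_shares = share(25)
y_shares = share(17)
total = reconstruct(add_shared(x_shares, y_shares))
print(total)  # 42
```

<p>Addition needs no communication between the parties holding the shares, which is one reason linear operations are cheap in this setting, while operations such as comparisons require more elaborate protocols.</p>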
<h2 id="push-the-boundaries">Push the boundaries</h2>
<p>Most <a href="https://medium.com/dropoutlabs/privacy-preserving-machine-learning-2018-a-year-in-review-b6345a95ae0f">research on encrypted deep learning</a> has so far focused on relatively simple models, typically with fewer than a handful of layers.</p>
<p><a href="https://arxiv.org/abs/1512.03385"><img src="/assets/tfe/resnet.png" style="width: 75%" /></a></p>
<p>Moving forward, we need to move beyond toy-like examples and tackle more models commonly used in real-world image analysis and in other domains such as natural language processing. Having the community settle on a few such models will help increase outside interest and bring the field forward by providing a focal point for research.</p>
<h2 id="data-science-workflow">Data science workflow</h2>
<p>While some constraints are currently due to technical maturity, others seem inherent in the fact that we now want to keep data private. In other words, even if we had perfect secure computation, with the same performance and scalability properties as plaintext, we would still need to figure out, and potentially adapt, how we do e.g. data exploration, feature engineering, and production monitoring in the encrypted space.</p>
<p>This area remains largely unexplored and we are excited about digging in further.</p>
<h1 id="conclusion">Conclusion</h1>
<p>Seeing TF Encrypted grow and create interest over the past two years has been an amazing experience, and it is becoming increasingly clear that the best way to push the field of privacy-preserving machine learning forward is to bring together practitioners from different domains.</p>
<p>As a result, development of the project is now officially by <em>The TF Encrypted Authors</em> with specific attribution given via the Git commit history. For situations where someone needs to take the final decision I remain benevolent dictator, working towards the core beliefs outlined here.</p>
<p>Learn more and become part of the development on <a href="https://github.com/tf-encrypted/tf-encrypted">GitHub</a>! 🚀</p>
<style>
img {
margin-left: auto;
margin-right: auto;
}
</style>Morten DahlWhat started out as a side project less than two years ago is growing up and moving into its own organization on GitHub!Experimenting with TF Encrypted2018-10-19T12:00:00+00:002018-10-19T12:00:00+00:00https://mortendahl.github.io/2018/10/19/experimenting-with-tf-encrypted<p>Privacy-preserving machine learning offers many benefits and interesting applications: being able to train and predict on data while it remains in encrypted form unlocks the utility of data that were previously inaccessible due to privacy concerns. But to make this happen several technical fields must come together, including cryptography, machine learning, distributed systems, and high-performance computing.</p>
<p>The <a href="https://tf-encrypted.io">TF Encrypted</a> open source project aims at bringing researchers and practitioners together in a familiar framework in order to accelerate exploration and adaptation. By building directly on <a href="https://www.tensorflow.org/">TensorFlow</a> it provides a high performance framework with an easy-to-use interface that abstracts away most of the underlying complexity, allowing users with only a basic familiarity with machine learning and TensorFlow to apply state-of-the-art cryptographic techniques without first becoming cross-disciplinary experts.</p>
<p>In this blog post we apply the library to a traditional machine learning example, providing a good starting point for anyone wishing to get into this rapidly growing field.</p>
<p><em>This is a <a href="https://medium.com/dropoutlabs/experimenting-with-tf-encrypted-fe37977ff03c">cross-posting</a> of work done at <a href="https://dropoutlabs.com">Dropout Labs</a> with <a href="https://twitter.com/jvmancuso">Jason Mancuso</a>.</em></p>
<h1 id="tensorflow-and-tf-encrypted">TensorFlow and TF Encrypted</h1>
<p>We start by looking at how our task can be solved in standard TensorFlow and then go through the changes needed to make the predictions private via TF Encrypted. Since the interface of the latter is meant to simulate the simple and concise expression of common machine learning operations that TensorFlow is well-known for, this requires only a small change that highlights what one must inherently think about when moving to the private setting.</p>
<p>Concretely, we consider the classic <a href="http://yann.lecun.com/exdb/mnist/">MNIST digit classification task</a>. To keep things simple we use a small neural network and train it in the traditional way in TensorFlow using an unencrypted training set. However, for making predictions with the trained model we turn to TF Encrypted, and show how two servers can perform predictions for a client without learning anything about its input. While this is a basic yet somewhat standard benchmark in the literature, the techniques used carry over to many different use cases, including medical image analysis.</p>
<p>Following standard practice, the script below shows our two-layer feedforward network with ReLU activations (more details in the <a href="https://arxiv.org/abs/1810.08130">preprint</a>).</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">tensorflow</span> <span class="k">as</span> <span class="n">tf</span>
<span class="c1"># generic functions for loading model weights and input data
</span><span class="k">def</span> <span class="nf">provide_weights</span><span class="p">():</span> <span class="s">"""Load model weights as TensorFlow objects."""</span>
<span class="k">def</span> <span class="nf">provide_input</span><span class="p">():</span> <span class="s">"""Load input data as TensorFlow objects."""</span>
<span class="k">def</span> <span class="nf">receive_output</span><span class="p">(</span><span class="n">logits</span><span class="p">):</span> <span class="k">return</span> <span class="n">tf</span><span class="o">.</span><span class="k">print</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">argmax</span><span class="p">(</span><span class="n">logits</span><span class="p">))</span>
<span class="c1"># get model weights/input data (both unencrypted)
</span><span class="n">w0</span><span class="p">,</span> <span class="n">b0</span><span class="p">,</span> <span class="n">w1</span><span class="p">,</span> <span class="n">b1</span><span class="p">,</span> <span class="n">w2</span><span class="p">,</span> <span class="n">b2</span> <span class="o">=</span> <span class="n">provide_weights</span><span class="p">()</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">provide_input</span><span class="p">()</span>
<span class="c1"># compute prediction
</span><span class="n">layer0</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">relu</span><span class="p">((</span><span class="n">tf</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">w0</span><span class="p">)</span> <span class="o">+</span> <span class="n">b0</span><span class="p">))</span>
<span class="n">layer1</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">relu</span><span class="p">((</span><span class="n">tf</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">layer0</span><span class="p">,</span> <span class="n">w1</span><span class="p">)</span> <span class="o">+</span> <span class="n">b1</span><span class="p">))</span>
<span class="n">logits</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">layer1</span><span class="p">,</span> <span class="n">w2</span><span class="p">)</span> <span class="o">+</span> <span class="n">b2</span>
<span class="c1"># get result of prediction and print
</span><span class="n">prediction_op</span> <span class="o">=</span> <span class="n">receive_output</span><span class="p">(</span><span class="n">logits</span><span class="p">)</span>
<span class="c1"># run graph execution in a tf.Session
</span><span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">Session</span><span class="p">()</span> <span class="k">as</span> <span class="n">sess</span><span class="p">:</span>
    <span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">prediction_op</span><span class="p">)</span>
</code></pre></div></div>
<p>Note that the <a href="https://github.com/tf-encrypted/tf-encrypted/tree/master/examples/mnist">concrete implementations</a> of <code class="language-plaintext highlighter-rouge">provide_weights</code> and <code class="language-plaintext highlighter-rouge">provide_input</code> have been left out for the sake of readability. These two methods simply load their respective values from NumPy arrays stored on disk, and return them as tensor objects.</p>
<p>We next turn to making the predictions private, where for the notion of privacy and encryption to even make sense we first need to recast our setting to consider more than the single party implicit in the script above. As seen below, expressing our intentions about who should get to see which values is the biggest difference between the two scripts.</p>
<p>We can naturally identify two of the parties: the prediction client who knows its own input and a model owner who knows the weights. Moreover, for the secure computation protocol chosen here we also need two servers that will be doing the actual computation on encrypted values; this is often desirable in applications where the clients may be mobile devices that have significant constraints on computational power and networking bandwidth.</p>
<p><img src="/assets/tfe/prediction-flow.png" alt="" /></p>
<p>In summary, our data flow and privacy assumptions are as illustrated in the diagram above. Here a model owner first gives encryptions of the model weights to the two servers in the middle (known as a <em>private input</em>), the prediction client then gives encryptions of its input to the two servers (another private input), who can execute the model and send back encryptions of the prediction result to the client, who can finally decrypt; at no point can the two servers decrypt any values. Below we see our script expressing these privacy assumptions.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">tensorflow</span> <span class="k">as</span> <span class="n">tf</span>
<span class="kn">import</span> <span class="nn">tf_encrypted</span> <span class="k">as</span> <span class="n">tfe</span>
<span class="c1"># generic functions for loading model weights and input data on each party
</span><span class="k">def</span> <span class="nf">provide_weights</span><span class="p">():</span> <span class="s">"""Loads the model weights on the model-owner party."""</span>
<span class="k">def</span> <span class="nf">provide_input</span><span class="p">():</span> <span class="s">"""Loads the input data on the prediction-client party."""</span>
<span class="k">def</span> <span class="nf">receive_output</span><span class="p">(</span><span class="n">logits</span><span class="p">):</span> <span class="k">return</span> <span class="n">tf</span><span class="o">.</span><span class="k">print</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">argmax</span><span class="p">(</span><span class="n">logits</span><span class="p">))</span>
<span class="c1"># get model weights/input data as private tensors from each party
</span><span class="n">w0</span><span class="p">,</span> <span class="n">b0</span><span class="p">,</span> <span class="n">w1</span><span class="p">,</span> <span class="n">b1</span><span class="p">,</span> <span class="n">w2</span><span class="p">,</span> <span class="n">b2</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">define_private_input</span><span class="p">(</span><span class="s">"model-owner"</span><span class="p">,</span> <span class="n">provide_weights</span><span class="p">)</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">define_private_input</span><span class="p">(</span><span class="s">"prediction-client"</span><span class="p">,</span> <span class="n">provide_input</span><span class="p">)</span>
<span class="c1"># compute secure prediction
</span><span class="n">layer0</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">relu</span><span class="p">((</span><span class="n">tfe</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">w0</span><span class="p">)</span> <span class="o">+</span> <span class="n">b0</span><span class="p">))</span>
<span class="n">layer1</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">relu</span><span class="p">((</span><span class="n">tfe</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">layer0</span><span class="p">,</span> <span class="n">w1</span><span class="p">)</span> <span class="o">+</span> <span class="n">b1</span><span class="p">))</span>
<span class="n">logits</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">matmul</span><span class="p">(</span><span class="n">layer1</span><span class="p">,</span> <span class="n">w2</span><span class="p">)</span> <span class="o">+</span> <span class="n">b2</span>
<span class="c1"># send prediction output back to client
</span><span class="n">prediction_op</span> <span class="o">=</span> <span class="n">tfe</span><span class="o">.</span><span class="n">define_output</span><span class="p">(</span><span class="s">"prediction-client"</span><span class="p">,</span> <span class="n">receive_output</span><span class="p">,</span> <span class="n">logits</span><span class="p">)</span>
<span class="c1"># run secure graph execution in a tfe.Session
</span><span class="k">with</span> <span class="n">tfe</span><span class="o">.</span><span class="n">Session</span><span class="p">()</span> <span class="k">as</span> <span class="n">sess</span><span class="p">:</span>
    <span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">prediction_op</span><span class="p">)</span>
</code></pre></div></div>
<p>Note that most of the code remains essentially identical to the traditional TensorFlow code, using <code class="language-plaintext highlighter-rouge">tfe</code> instead of <code class="language-plaintext highlighter-rouge">tf</code>:</p>
<ul>
<li>
<p>The <code class="language-plaintext highlighter-rouge">provide_weights</code> method for loading model weights is now wrapped in a call to <code class="language-plaintext highlighter-rouge">tfe.define_private_input</code> in order to specify that the weights should be owned by, and restricted to, the model owner; by wrapping the method call, TF Encrypted will encrypt them before sharing with other parties in the computation.</p>
</li>
<li>
<p>As with the weights, the prediction input is now also only accessible to the prediction client, who is also the only receiver of the output. Here the <code class="language-plaintext highlighter-rouge">tf.print</code> statement has been moved into <code class="language-plaintext highlighter-rouge">receive_output</code> as this is now the only point where the result is known in plaintext.</p>
</li>
<li>
<p>We also tie the names of the parties to their network hosts. Although omitted here, this information also needs to be available on these hosts, and is typically shared via a simple configuration file.</p>
</li>
</ul>
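<p>As a rough illustration of the last point, the mapping from party names to network hosts could live in a small JSON file that every host loads. The sketch below is hypothetical: the file layout, the addresses, the server names, and the <code class="language-plaintext highlighter-rouge">load_hosts</code> helper are assumptions made for illustration, and not necessarily the format TF Encrypted itself expects.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import json

# Hypothetical configuration: party name mapped to "host:port".
EXAMPLE_CONFIG = """
{
  "server0": "10.0.0.10:4440",
  "server1": "10.0.0.11:4440",
  "model-owner": "10.0.0.20:4440",
  "prediction-client": "10.0.0.30:4440"
}
"""

def load_hosts(text):
    """Parse the JSON text into a dict of party name to (host, port)."""
    hosts = {}
    for party, address in json.loads(text).items():
        host, port = address.rsplit(":", 1)
        hosts[party] = (host, int(port))
    return hosts

hosts = load_hosts(EXAMPLE_CONFIG)
print(hosts["prediction-client"])
</code></pre></div></div>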
<h1 id="whats-the-point">What’s the Point?</h1>
<ul>
<li>
<p><strong>user-friendly</strong>: very little boilerplate, very similar to traditional TensorFlow.</p>
</li>
<li>
<p><strong>abstract and modular</strong>: it integrates secure computation tightly with machine learning code, hiding advanced cryptographic operations underneath normal tensor operations.</p>
</li>
<li>
<p><strong>extensible</strong>: new protocols and techniques can be added under the hood, and the high-level API won’t change. Similarly, new machine learning layers can be added and defined on top of each underlying protocol as needed, just like in normal TensorFlow.</p>
</li>
<li>
<p><strong>fast</strong>: all of this is computed efficiently since it gets compiled down to ordinary TensorFlow graphs, and can hence take advantage of the optimized primitives for distributed computation that the TensorFlow backend provides.</p>
</li>
</ul>
<p>These properties also make it easy to <strong>benchmark</strong> a diverse set of combinations of machine learning models and secure computation protocols. This allows for fairer comparisons, more confident experimental results, and a more rigorous empirical science, all while lowering the barrier to entry to private machine learning.</p>
<p>Finally, by operating directly in TensorFlow we also benefit from its ecosystem and can take advantage of existing tools such as <a href="https://www.tensorflow.org/guide/graph_viz">TensorBoard</a>. For instance, one can profile which operations are most expensive and where additional optimizations should be applied, and one can inspect where values reside and ensure correctness and security during implementation of the cryptographic protocols as shown below.</p>
<p><img src="/assets/tensorspdz/masking-reuse.png" alt="" /></p>
<p>Here, we visualize the various operations that make up a secure operation on two private values. Each of the nodes in the underlying computation graph is shaded according to which machine aided that node’s execution, and each comes with handy information about data flow and execution time. This gives the user a completely transparent yet effective way of auditing secure computations, while simultaneously allowing for program debugging.</p>
<h1 id="conclusion">Conclusion</h1>
<p><a href="https://github.com/tf-encrypted">TF Encrypted</a> is about providing researchers and practitioners with the open-source tools they need to quickly experiment with secure protocols and primitives for private machine learning.</p>
<p>The hope is that this will aid and inspire the next generation of researchers to implement their own novel protocols and techniques for secure computation in a fraction of the time, so that machine learning engineers can start to apply these techniques for their own use cases in a framework they’re already intimately familiar with.</p>
<p>To find out more have a look at the recent <a href="https://arxiv.org/abs/1810.08130">preprint</a> or dive into the <a href="https://github.com/tf-encrypted/tf-encrypted/tree/master/examples">examples on GitHub</a>!</p>Morten DahlPrivacy-preserving machine learning offers many benefits and interesting applications: being able to train and predict on data while it remains in encrypted form unlocks the utility of data that were previously inaccessible due to privacy concerns. But to make this happen several technical fields must come together, including cryptography, machine learning, distributed systems, and high-performance computing.Secure Computations as Dataflow Programs2018-03-01T12:00:00+00:002018-03-01T12:00:00+00:00https://mortendahl.github.io/2018/03/01/secure-computation-as-dataflow-programs<p><em><strong>TL;DR:</strong> using TensorFlow as a distributed computation framework for dataflow programs we give a full implementation of the SPDZ protocol with networking, in turn enabling optimised machine learning on encrypted data.</em></p>
<p>Unlike <a href="/2017/09/03/the-spdz-protocol-part1/">earlier</a> where we focused on the concepts behind secure computation as well as <a href="/2017/09/19/private-image-analysis-with-mpc/">potential applications</a>, here we build a fully working (passively secure) implementation with players running on different machines and communicating via typical network stacks. And as part of this we investigate the benefits of using a <a href="https://en.wikipedia.org/wiki/Dataflow_programming">modern distributed computation</a> platform when experimenting with secure computations, as opposed to building everything from scratch.</p>
<p>Additionally, this can also be seen as a step in the direction of getting private machine learning into the hands of practitioners, where integration with existing and popular tools such as <a href="https://www.tensorflow.org/">TensorFlow</a> plays an important part. Concretely, while we here only do a relatively shallow integration that doesn’t make use of some of the powerful tools that come with TensorFlow (e.g. <a href="https://www.tensorflow.org/api_docs/python/tf/gradients">automatic differentiation</a>), we do show how basic technical obstacles can be overcome, potentially paving the way for deeper integrations.</p>
<p>Jumping ahead, it is clear in retrospect that TensorFlow is an obvious candidate framework for quickly experimenting with secure computation protocols, at the very least in the context of private machine learning.</p>
<p><a href="https://github.com/mortendahl/privateml/tree/master/tensorflow/spdz/">All code</a> is available to play with, either locally or on the <a href="https://cloud.google.com/compute/">Google Cloud</a>. To keep it simple our running example throughout is private prediction using <a href="https://beckernick.github.io/logistic-regression-from-scratch/">logistic</a> <a href="https://github.com/ageron/handson-ml/blob/master/04_training_linear_models.ipynb">regression</a>, meaning that given a private (i.e. encrypted) input <code class="language-plaintext highlighter-rouge">x</code> we wish to securely compute <code class="language-plaintext highlighter-rouge">sigmoid(dot(w, x) + b)</code> for private but pre-trained weights <code class="language-plaintext highlighter-rouge">w</code> and bias <code class="language-plaintext highlighter-rouge">b</code> (private training of <code class="language-plaintext highlighter-rouge">w</code> and <code class="language-plaintext highlighter-rouge">b</code> is considered in a follow-up post). <a href="#experiments">Experiments</a> show that for a model with 100 features this can be done in TensorFlow with a latency as low as 60ms and at a rate of up to 20,000 predictions per second.</p>
<p><em>A big thank you goes out to <a href="https://twitter.com/iamtrask">Andrew Trask</a>, <a href="https://twitter.com/korymath">Kory Mathewson</a>, <a href="https://twitter.com/janleike">Jan Leike</a>, and the <a href="https://twitter.com/openminedorg">OpenMined community</a> for inspiration and interesting discussions on this topic!</em></p>
<p><em><strong>Disclaimer</strong>: this implementation is meant for experimentation only and may not live up to required security. In particular, TensorFlow does not currently seem to have been designed with this application in mind, and although it does not appear to be the case right now, may for instance in future versions perform optimisations behind the scenes that break the intended security properties. <a href="#thoughts">More notes below</a>.</em></p>
<h1 id="motivation">Motivation</h1>
<p>As hinted above, implementing secure computation protocols such as <a href="/2017/09/03/the-spdz-protocol-part1/">SPDZ</a> is a non-trivial task due to their distributed nature, which is only made worse when we start to introduce various optimisations (<a href="https://github.com/rdragos/awesome-mpc">but</a> <a href="https://github.com/bristolcrypto/SPDZ-2">it</a> <a href="https://github.com/aicis/fresco">can</a> <a href="http://oblivc.org/">be</a> <a href="https://github.com/encryptogroup/ABY">done</a>). For instance, one has to consider how to best orchestrate the simultaneous execution of multiple programs, how to minimise the overhead of sending data across the network, and how to efficiently interleave it with computation so that one server only rarely waits on the other. On top of that, we might also want to support different hardware platforms, including for instance both CPUs and GPUs, and for any serious work it is highly valuable to have tools for visual inspection, debugging, and profiling in order to identify issues and bottlenecks.</p>
<p>It should furthermore also be easy to experiment with various optimisations, such as transforming the computation for improved performance, <a href="/2017/09/19/private-image-analysis-with-mpc/#generalised-triples">reusing intermediate results and masked values</a>, and supplying fresh “raw material” in the form of <a href="/2017/09/03/the-spdz-protocol-part1/#multiplication">triples</a> during the execution instead of only generating a large batch ahead of time in an offline phase. Getting all this right can be overwhelming, which is one reason earlier blog posts here focused on the principles behind secure computation protocols and simply did everything locally.</p>
<p>Luckily though, modern distributed computation frameworks such as <a href="https://www.tensorflow.org/">TensorFlow</a> are receiving a lot of research and engineering attention these days due to their use in advanced machine learning on large data sets. And since our focus is on private machine learning, there is naturally a large overlap. In particular, the secure operations we are interested in are tensor addition, subtraction, multiplication, dot products, truncation, and sampling, which all have insecure but highly optimised counterparts in TensorFlow.</p>
<h2 id="prerequisites">Prerequisites</h2>
<p>We make the assumption that the main principles behind both TensorFlow and the SPDZ protocol are already understood – if not then there are <a href="https://www.tensorflow.org/tutorials/">plenty</a> <a href="https://learningtensorflow.com/">of</a> <a href="https://github.com/ageron/handson-ml">good</a> <a href="https://developers.google.com/machine-learning/crash-course/">resources</a> for the former (including <a href="https://www.tensorflow.org/about/bib">whitepapers</a>) and e.g. <a href="/2017/09/03/the-spdz-protocol-part1/">previous</a> <a href="/2017/09/10/the-spdz-protocol-part2/">blog</a> <a href="/2017-09-19-private-image-analysis-with-mpc.md">posts</a> for the latter. As for the different parties involved, we also here assume a <a href="/2017/09/19/private-image-analysis-with-mpc/#setting">setting</a> with two servers, a crypto producer, an input provider, and an output receiver.</p>
<p>One important note though is that TensorFlow works by first constructing a static <a href="https://www.tensorflow.org/programmers_guide/graphs">computation graph</a> that is subsequently executed in a <a href="https://www.tensorflow.org/api_guides/python/client">session</a>. For instance, inspecting the graph we get from <code class="language-plaintext highlighter-rouge">sigmoid(dot(w, x) + b)</code> in <a href="https://www.tensorflow.org/programmers_guide/graph_viz">TensorBoard</a> shows the following.</p>
<p><img src="/assets/tensorspdz/structure.png" alt="" /></p>
<p>This means that our efforts in this post are concerned with building such a graph, as opposed to actual execution as earlier: we are to some extent making a small compiler that translates secure computations expressed in a simple language into TensorFlow programs. As a result we benefit not only from working at a higher level of abstraction but also from the large amount of effort that has already gone into optimising graph execution in TensorFlow.</p>
<p>See the <a href="#experiments">experiments</a> for a full code example.</p>
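<p>The split between building a graph and executing it can be mimicked in a few lines of plain Python. This is only a toy sketch and not how TensorFlow is implemented; the names <code class="language-plaintext highlighter-rouge">Node</code>, <code class="language-plaintext highlighter-rouge">constant</code>, and <code class="language-plaintext highlighter-rouge">run</code> are made up for illustration.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Node:
    """A deferred operation: nothing is computed until run() is called."""
    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

def constant(value):
    return Node(lambda: value)

def add(x, y):
    return Node(lambda a, b: a + b, x, y)

def mul(x, y):
    return Node(lambda a, b: a * b, x, y)

def run(node):
    """Evaluate a node by first evaluating its inputs."""
    args = [run(inp) for inp in node.inputs]
    return node.fn(*args)

# building the graph performs no arithmetic
graph = add(mul(constant(3), constant(4)), constant(5))

# execution happens only here, playing the role of a session
result = run(graph)
print(result)
</code></pre></div></div>
<p>Functions such as <code class="language-plaintext highlighter-rouge">add</code> only return graph nodes, which is the same role the secure operations in this post play with respect to TensorFlow.</p>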
<h1 id="basics">Basics</h1>
<p>Our needs fit nicely with the operations already provided by TensorFlow as seen next, with one main exception: to match typical precision of floating point numbers when instead working with <a href="/2017/09/03/the-spdz-protocol-part1/#fixedpoint-numbers">fixedpoint numbers</a> in the secure setting, we end up encoding into and operating on integers that are larger than what fits in the typical word sizes of 32 or 64 bits, yet today these are the only sizes for which TensorFlow provides operations (a constraint that may have something to do with current support on GPUs).</p>
<p>Luckily though, for the operations we need there are efficient ways around this that allow us to simulate arithmetic on tensors of ~120 bit integers using a list of tensors with identical shape but of e.g. 32 bit integers. And this decomposition moreover has the nice property that we can often operate on each tensor in the list independently, so in addition to enabling the use of TensorFlow, this also allows most operations to be performed in parallel and can actually <a href="https://en.wikipedia.org/wiki/Residue_number_system">increase efficiency</a> compared to operating on single larger numbers, despite the fact that it may initially sound more expensive.</p>
<p>We discuss the <a href="/2018/01/29/the-chinese-remainder-theorem/">details</a> of this elsewhere and for the rest of this post simply assume operations <code class="language-plaintext highlighter-rouge">crt_add</code>, <code class="language-plaintext highlighter-rouge">crt_sub</code>, <code class="language-plaintext highlighter-rouge">crt_mul</code>, <code class="language-plaintext highlighter-rouge">crt_dot</code>, <code class="language-plaintext highlighter-rouge">crt_mod</code>, and <code class="language-plaintext highlighter-rouge">sample</code> that perform the expected operations on lists of tensors. Note that <code class="language-plaintext highlighter-rouge">crt_mod</code>, <code class="language-plaintext highlighter-rouge">crt_mul</code>, and <code class="language-plaintext highlighter-rouge">crt_sub</code> together allow us to define a right shift operation for <a href="/2017/09/03/the-spdz-protocol-part1/#fixedpoint-numbers">fixedpoint truncation</a>.</p>
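<p>To give a feel for the decomposition, here is a minimal sketch on plain Python integers instead of lists of tensors. The moduli are tiny and chosen purely for illustration, the helpers <code class="language-plaintext highlighter-rouge">crt_decompose</code> and <code class="language-plaintext highlighter-rouge">crt_recombine</code> are hypothetical names, and the bodies of <code class="language-plaintext highlighter-rouge">crt_add</code>, <code class="language-plaintext highlighter-rouge">crt_sub</code>, and <code class="language-plaintext highlighter-rouge">crt_mul</code> are assumptions about what such operations could look like rather than the actual implementation. It relies on <code class="language-plaintext highlighter-rouge">pow(n, -1, m)</code>, which needs Python 3.8 or later.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from functools import reduce

# Small pairwise coprime moduli, for illustration only; the text above
# has components of e.g. 32 bits in mind.
MODULI = [89, 97, 101, 103]
M = reduce(lambda a, b: a * b, MODULI)

def crt_decompose(x):
    """Represent x by its residues modulo each modulus."""
    return [x % m for m in MODULI]

def crt_recombine(residues):
    """Recover x mod M from its residues (Chinese remainder theorem)."""
    total = 0
    for r, m in zip(residues, MODULI):
        n = M // m
        total += r * n * pow(n, -1, m)
    return total % M

# each component is processed independently of the others
def crt_add(x, y):
    return [(a + b) % m for a, b, m in zip(x, y, MODULI)]

def crt_sub(x, y):
    return [(a - b) % m for a, b, m in zip(x, y, MODULI)]

def crt_mul(x, y):
    return [(a * b) % m for a, b, m in zip(x, y, MODULI)]

x, y = 1234, 5678
xs, ys = crt_decompose(x), crt_decompose(y)
total = crt_recombine(crt_add(xs, ys))
product = crt_recombine(crt_mul(xs, ys))
print(total == (x + y) % M, product == (x * y) % M)
</code></pre></div></div>
<p>Because every component only ever sees numbers below its own modulus, each step stays within small word sizes even though the combined value ranges over all of <code class="language-plaintext highlighter-rouge">M</code>.</p>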
<h2 id="private-tensors">Private tensors</h2>
<p>Each private tensor is determined by two shares, one on each server. And for the reasons mentioned above, each share is given by a list of tensors, which is represented by a list of nodes in the graph. To hide this complexity we introduce a simple class as follows.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">PrivateTensor</span><span class="p">:</span>
    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">share0</span><span class="p">,</span> <span class="n">share1</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">share0</span> <span class="o">=</span> <span class="n">share0</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">share1</span> <span class="o">=</span> <span class="n">share1</span>
    <span class="o">@</span><span class="nb">property</span>
    <span class="k">def</span> <span class="nf">shape</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">share0</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">shape</span>
    <span class="o">@</span><span class="nb">property</span>
    <span class="k">def</span> <span class="nf">unwrapped</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">share0</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">share1</span>
</code></pre></div></div>
<p>And thanks to TensorFlow we can know the shape of tensors at graph creation time, meaning we don’t have to keep track of this ourselves.</p>
<h2 id="simple-operations">Simple operations</h2>
<p>Since a secure operation will often be expressed in terms of several TensorFlow operations, we use abstract operations such as <code class="language-plaintext highlighter-rouge">add</code>, <code class="language-plaintext highlighter-rouge">mul</code>, and <code class="language-plaintext highlighter-rouge">dot</code> as a convenient way of constructing the computation graph. The first one is <code class="language-plaintext highlighter-rouge">add</code>, where the resulting graph simply instructs the two servers to locally combine the shares they each have using a subgraph constructed by <code class="language-plaintext highlighter-rouge">crt_add</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">add</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
    <span class="k">assert</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">PrivateTensor</span><span class="p">)</span>
    <span class="k">assert</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">PrivateTensor</span><span class="p">)</span>
    <span class="n">x0</span><span class="p">,</span> <span class="n">x1</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">unwrapped</span>
    <span class="n">y0</span><span class="p">,</span> <span class="n">y1</span> <span class="o">=</span> <span class="n">y</span><span class="o">.</span><span class="n">unwrapped</span>
    <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">name_scope</span><span class="p">(</span><span class="s">'add'</span><span class="p">):</span>
        <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">SERVER_0</span><span class="p">):</span>
            <span class="n">z0</span> <span class="o">=</span> <span class="n">crt_add</span><span class="p">(</span><span class="n">x0</span><span class="p">,</span> <span class="n">y0</span><span class="p">)</span>
        <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">SERVER_1</span><span class="p">):</span>
            <span class="n">z1</span> <span class="o">=</span> <span class="n">crt_add</span><span class="p">(</span><span class="n">x1</span><span class="p">,</span> <span class="n">y1</span><span class="p">)</span>
    <span class="n">z</span> <span class="o">=</span> <span class="n">PrivateTensor</span><span class="p">(</span><span class="n">z0</span><span class="p">,</span> <span class="n">z1</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">z</span>
</code></pre></div></div>
<p>Notice how easy it is to use <a href="https://www.tensorflow.org/api_docs/python/tf/device"><code class="language-plaintext highlighter-rouge">tf.device()</code></a> to express which server is doing what! This command ties the computation and its resulting value to the specified host, and instructs TensorFlow to automatically insert appropriate networking operations to make sure that the input values are available when needed!</p>
<p>As an example, in the above, if <code class="language-plaintext highlighter-rouge">x0</code> was previously on, say, the input provider then TensorFlow will insert send and receive instructions that copy it to <code class="language-plaintext highlighter-rouge">SERVER_0</code> as part of computing <code class="language-plaintext highlighter-rouge">add</code>. All of this is abstracted away and the framework will attempt to <a href="https://www.tensorflow.org/about/bib">figure out</a> the best strategy for optimising exactly when to perform sends and receives, including batching to better utilise the network and keeping the compute units busy.</p>
<p>The <a href="https://www.tensorflow.org/api_docs/python/tf/name_scope"><code class="language-plaintext highlighter-rouge">tf.name_scope()</code></a> command on the other hand is simply a logical abstraction that doesn’t influence computations but can be used to make the graphs much easier to visualise in <a href="https://www.tensorflow.org/programmers_guide/summaries_and_tensorboard">TensorBoard</a> by grouping subgraphs as single components as also seen earlier.</p>
<p><img src="/assets/tensorspdz/add.png" alt="" /></p>
<p>Note that by selecting <em>Device</em> coloring in TensorBoard as done above we can also use it to verify where the operations were actually computed, in this case that addition was indeed done locally by the two servers (green and turquoise).</p>
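<p>To see why purely local addition is enough, recall that each private value is split into two shares that sum to it. The following is a minimal sketch on plain Python integers rather than lists of tensors; the modulus <code class="language-plaintext highlighter-rouge">Q</code> and the <code class="language-plaintext highlighter-rouge">reconstruct</code> helper are assumptions made for illustration, and <code class="language-plaintext highlighter-rouge">share</code> here is only a guess at what a sharing helper might look like.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import random

Q = 2**61 - 1  # illustrative modulus, an assumption for this sketch

def share(secret):
    """Split secret into two additive shares modulo Q."""
    share0 = random.randrange(Q)
    share1 = (secret - share0) % Q
    return share0, share1

def reconstruct(share0, share1):
    """Recombine two additive shares into the value they represent."""
    return (share0 + share1) % Q

x0, x1 = share(42)
y0, y1 = share(100)

# each server only combines the shares it already holds
z0 = (x0 + y0) % Q  # on the first server
z1 = (x1 + y1) % Q  # on the second server

z = reconstruct(z0, z1)
print(z == (42 + 100) % Q)
</code></pre></div></div>
<p>Neither server learns anything from its own share in isolation, since each share on its own is uniformly random, yet the two results together determine the sum.</p>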
<h2 id="dot-products">Dot products</h2>
<p>We next turn to dot products. This is more complex, not least since we now need to involve the crypto producer, but also since the two servers have to communicate with each other as part of the computation.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">dot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
    <span class="k">assert</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">PrivateTensor</span><span class="p">)</span>
    <span class="k">assert</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">PrivateTensor</span><span class="p">)</span>
    <span class="n">x0</span><span class="p">,</span> <span class="n">x1</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">unwrapped</span>
    <span class="n">y0</span><span class="p">,</span> <span class="n">y1</span> <span class="o">=</span> <span class="n">y</span><span class="o">.</span><span class="n">unwrapped</span>
    <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">name_scope</span><span class="p">(</span><span class="s">'dot'</span><span class="p">):</span>
        <span class="c1"># triple generation
</span>
        <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">CRYPTO_PRODUCER</span><span class="p">):</span>
            <span class="n">a</span> <span class="o">=</span> <span class="n">sample</span><span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span>
            <span class="n">b</span> <span class="o">=</span> <span class="n">sample</span><span class="p">(</span><span class="n">y</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span>
            <span class="n">ab</span> <span class="o">=</span> <span class="n">crt_dot</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span>
            <span class="n">a0</span><span class="p">,</span> <span class="n">a1</span> <span class="o">=</span> <span class="n">share</span><span class="p">(</span><span class="n">a</span><span class="p">)</span>
            <span class="n">b0</span><span class="p">,</span> <span class="n">b1</span> <span class="o">=</span> <span class="n">share</span><span class="p">(</span><span class="n">b</span><span class="p">)</span>
            <span class="n">ab0</span><span class="p">,</span> <span class="n">ab1</span> <span class="o">=</span> <span class="n">share</span><span class="p">(</span><span class="n">ab</span><span class="p">)</span>
        <span class="c1"># masking after distributing the triple
</span>
        <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">SERVER_0</span><span class="p">):</span>
            <span class="n">alpha0</span> <span class="o">=</span> <span class="n">crt_sub</span><span class="p">(</span><span class="n">x0</span><span class="p">,</span> <span class="n">a0</span><span class="p">)</span>
            <span class="n">beta0</span> <span class="o">=</span> <span class="n">crt_sub</span><span class="p">(</span><span class="n">y0</span><span class="p">,</span> <span class="n">b0</span><span class="p">)</span>
        <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">SERVER_1</span><span class="p">):</span>
            <span class="n">alpha1</span> <span class="o">=</span> <span class="n">crt_sub</span><span class="p">(</span><span class="n">x1</span><span class="p">,</span> <span class="n">a1</span><span class="p">)</span>
            <span class="n">beta1</span> <span class="o">=</span> <span class="n">crt_sub</span><span class="p">(</span><span class="n">y1</span><span class="p">,</span> <span class="n">b1</span><span class="p">)</span>
        <span class="c1"># recombination after exchanging alphas and betas
</span>
        <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">SERVER_0</span><span class="p">):</span>
            <span class="n">alpha</span> <span class="o">=</span> <span class="n">reconstruct</span><span class="p">(</span><span class="n">alpha0</span><span class="p">,</span> <span class="n">alpha1</span><span class="p">)</span>
            <span class="n">beta</span> <span class="o">=</span> <span class="n">reconstruct</span><span class="p">(</span><span class="n">beta0</span><span class="p">,</span> <span class="n">beta1</span><span class="p">)</span>
            <span class="n">z0</span> <span class="o">=</span> <span class="n">crt_add</span><span class="p">(</span><span class="n">ab0</span><span class="p">,</span>
                 <span class="n">crt_add</span><span class="p">(</span><span class="n">crt_dot</span><span class="p">(</span><span class="n">a0</span><span class="p">,</span> <span class="n">beta</span><span class="p">),</span>
                 <span class="n">crt_add</span><span class="p">(</span><span class="n">crt_dot</span><span class="p">(</span><span class="n">alpha</span><span class="p">,</span> <span class="n">b0</span><span class="p">),</span>
                         <span class="n">crt_dot</span><span class="p">(</span><span class="n">alpha</span><span class="p">,</span> <span class="n">beta</span><span class="p">))))</span>
        <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">SERVER_1</span><span class="p">):</span>
            <span class="n">alpha</span> <span class="o">=</span> <span class="n">reconstruct</span><span class="p">(</span><span class="n">alpha0</span><span class="p">,</span> <span class="n">alpha1</span><span class="p">)</span>
            <span class="n">beta</span> <span class="o">=</span> <span class="n">reconstruct</span><span class="p">(</span><span class="n">beta0</span><span class="p">,</span> <span class="n">beta1</span><span class="p">)</span>
            <span class="n">z1</span> <span class="o">=</span> <span class="n">crt_add</span><span class="p">(</span><span class="n">ab1</span><span class="p">,</span>
                 <span class="n">crt_add</span><span class="p">(</span><span class="n">crt_dot</span><span class="p">(</span><span class="n">a1</span><span class="p">,</span> <span class="n">beta</span><span class="p">),</span>
                         <span class="n">crt_dot</span><span class="p">(</span><span class="n">alpha</span><span class="p">,</span> <span class="n">b1</span><span class="p">)))</span>
    <span class="n">z</span> <span class="o">=</span> <span class="n">PrivateTensor</span><span class="p">(</span><span class="n">z0</span><span class="p">,</span> <span class="n">z1</span><span class="p">)</span>
    <span class="n">z</span> <span class="o">=</span> <span class="n">truncate</span><span class="p">(</span><span class="n">z</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">z</span>
</code></pre></div></div>
<p>However, with <code class="language-plaintext highlighter-rouge">tf.device()</code> we see that this is still relatively straightforward, at least if the <a href="/2017/09/19/private-image-analysis-with-mpc/#dense-layers">protocol for secure dot products</a> is already understood. We first construct a graph that makes the crypto producer generate a new dot triple. The output nodes of this graph are <code class="language-plaintext highlighter-rouge">a0, a1, b0, b1, ab0, ab1</code>.</p>
<p>With <code class="language-plaintext highlighter-rouge">crt_sub</code> we then build graphs for the two servers that mask <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> using <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">b</code> respectively. TensorFlow will again take care of inserting networking code that sends the value of e.g. <code class="language-plaintext highlighter-rouge">a0</code> to <code class="language-plaintext highlighter-rouge">SERVER_0</code> during execution.</p>
<p>In the third step we reconstruct <code class="language-plaintext highlighter-rouge">alpha</code> and <code class="language-plaintext highlighter-rouge">beta</code> on each server, and compute the recombination step to get the dot product. Note that we have to define <code class="language-plaintext highlighter-rouge">alpha</code> and <code class="language-plaintext highlighter-rouge">beta</code> twice, once for each server: although they contain the same value, had we instead defined them on only one server but used them on both, we would implicitly have inserted additional networking operations and hence slowed down the computation.</p>
<p><img src="/assets/tensorspdz/dot.png" alt="" /></p>
<p>Returning to TensorBoard we can verify that the nodes are indeed tied to the correct players, with yellow being the crypto producer, and green and turquoise being the two servers. Note the convenience of having <code class="language-plaintext highlighter-rouge">tf.name_scope()</code> here.</p>
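<p>To check the recombination step in isolation, the protocol can be replayed on plain Python integers. The following is only a minimal sketch under assumptions that are not part of the code above: a single prime modulus <code class="language-plaintext highlighter-rouge">Q</code> instead of the CRT representation, nested lists as matrices, no fixed-point truncation, and all three players simulated in one process. The helper names are hypothetical stand-ins for <code class="language-plaintext highlighter-rouge">crt_add</code>, <code class="language-plaintext highlighter-rouge">crt_sub</code> and <code class="language-plaintext highlighter-rouge">crt_dot</code>.</p>

```python
import random

Q = 2147483647  # illustrative prime modulus (assumption; the post uses a CRT representation)

def sample(rows, cols):
    # uniformly random matrix with entries in [0, Q)
    return [[random.randrange(Q) for _ in range(cols)] for _ in range(rows)]

def add(x, y):
    return [[(a + b) % Q for a, b in zip(r, s)] for r, s in zip(x, y)]

def sub(x, y):
    return [[(a - b) % Q for a, b in zip(r, s)] for r, s in zip(x, y)]

def matmul(x, y):
    cols = list(zip(*y))
    return [[sum(a * b for a, b in zip(row, col)) % Q for col in cols] for row in x]

def share(x):
    # additive sharing: x = x0 + x1 (mod Q)
    x0 = sample(len(x), len(x[0]))
    return x0, sub(x, x0)

def reconstruct(x0, x1):
    return add(x0, x1)

def secure_dot(x_shares, y_shares):
    x0, x1 = x_shares
    y0, y1 = y_shares
    # crypto producer: triple generation
    a = sample(len(x0), len(x0[0]))
    b = sample(len(y0), len(y0[0]))
    a0, a1 = share(a)
    b0, b1 = share(b)
    ab0, ab1 = share(matmul(a, b))
    # servers: mask locally, then exchange and reconstruct alpha and beta
    alpha = reconstruct(sub(x0, a0), sub(x1, a1))
    beta = reconstruct(sub(y0, b0), sub(y1, b1))
    # recombination: only server 0 adds the public alpha-times-beta term
    z0 = add(ab0, add(matmul(a0, beta), add(matmul(alpha, b0), matmul(alpha, beta))))
    z1 = add(ab1, add(matmul(a1, beta), matmul(alpha, b1)))
    return z0, z1

x = [[1, 2], [3, 4]]
y = [[5, 6], [7, 8]]
z = reconstruct(*secure_dot(share(x), share(y)))
print(z)  # prints [[19, 22], [43, 50]]
```

<p>Summing the two output shares gives <code class="language-plaintext highlighter-rouge">ab + a.beta + alpha.b + alpha.beta</code>, which equals <code class="language-plaintext highlighter-rouge">(a + alpha).(b + beta)</code> and hence the dot product of <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code>. This is why the <code class="language-plaintext highlighter-rouge">alpha.beta</code> term must be added by exactly one of the servers.</p>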
<h2 id="configuration">Configuration</h2>
<p>To fully claim that this has made the distributed aspects of secure computations much easier to express, we also have to see what is actually needed for <code class="language-plaintext highlighter-rouge">tf.device()</code> to work as intended. In the code below we first define an arbitrary job name followed by identifiers for our five players. More interestingly, we then simply specify their network hosts and wrap this in a <a href="https://www.tensorflow.org/deploy/distributed"><code class="language-plaintext highlighter-rouge">ClusterSpec</code></a>. That’s it!</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">JOB_NAME</span> <span class="o">=</span> <span class="s">'spdz'</span>
<span class="n">SERVER_0</span> <span class="o">=</span> <span class="s">'/job:{}/task:0'</span><span class="o">.</span><span class="nb">format</span><span class="p">(</span><span class="n">JOB_NAME</span><span class="p">)</span>
<span class="n">SERVER_1</span> <span class="o">=</span> <span class="s">'/job:{}/task:1'</span><span class="o">.</span><span class="nb">format</span><span class="p">(</span><span class="n">JOB_NAME</span><span class="p">)</span>
<span class="n">CRYPTO_PRODUCER</span> <span class="o">=</span> <span class="s">'/job:{}/task:2'</span><span class="o">.</span><span class="nb">format</span><span class="p">(</span><span class="n">JOB_NAME</span><span class="p">)</span>
<span class="n">INPUT_PROVIDER</span> <span class="o">=</span> <span class="s">'/job:{}/task:3'</span><span class="o">.</span><span class="nb">format</span><span class="p">(</span><span class="n">JOB_NAME</span><span class="p">)</span>
<span class="n">OUTPUT_RECEIVER</span> <span class="o">=</span> <span class="s">'/job:{}/task:4'</span><span class="o">.</span><span class="nb">format</span><span class="p">(</span><span class="n">JOB_NAME</span><span class="p">)</span>
<span class="n">HOSTS</span> <span class="o">=</span> <span class="p">[</span>
    <span class="s">'10.132.0.4:4440'</span><span class="p">,</span>
    <span class="s">'10.132.0.5:4441'</span><span class="p">,</span>
    <span class="s">'10.132.0.6:4442'</span><span class="p">,</span>
    <span class="s">'10.132.0.7:4443'</span><span class="p">,</span>
    <span class="s">'10.132.0.8:4444'</span><span class="p">,</span>
<span class="p">]</span>
<span class="n">CLUSTER</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">train</span><span class="o">.</span><span class="n">ClusterSpec</span><span class="p">({</span>
    <span class="n">JOB_NAME</span><span class="p">:</span> <span class="n">HOSTS</span>
<span class="p">})</span>
</code></pre></div></div>
<p><em>Note that in the screenshots we are actually running the input provider and output receiver on the same host, and hence both show up as <code class="language-plaintext highlighter-rouge">3/device:CPU:0</code>.</em></p>
<p>Finally, the code that each player executes is about as simple as it gets.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">server</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">train</span><span class="o">.</span><span class="n">Server</span><span class="p">(</span><span class="n">CLUSTER</span><span class="p">,</span> <span class="n">job_name</span><span class="o">=</span><span class="n">JOB_NAME</span><span class="p">,</span> <span class="n">task_index</span><span class="o">=</span><span class="n">ROLE</span><span class="p">)</span>
<span class="n">server</span><span class="o">.</span><span class="n">start</span><span class="p">()</span>
<span class="n">server</span><span class="o">.</span><span class="n">join</span><span class="p">()</span>
</code></pre></div></div>
<p>Here the value of <code class="language-plaintext highlighter-rouge">ROLE</code> is the only thing that differs between the programs the five players run, and it is typically given as a command-line argument.</p>
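<p>One way of turning a command-line argument into the task index could look as follows. This is a sketch only: the role names and the <code class="language-plaintext highlighter-rouge">parse_role</code> and <code class="language-plaintext highlighter-rouge">device_for</code> helpers are hypothetical and not taken from the repository, and the argument list is hard-coded here so that the snippet runs on its own.</p>

```python
import argparse

JOB_NAME = 'spdz'

# hypothetical role names; their order matches the task indices used above
ROLES = ['server0', 'server1', 'crypto_producer', 'input_provider', 'output_receiver']

def device_for(task_index):
    # same device string format as SERVER_0, CRYPTO_PRODUCER, etc.
    return '/job:{}/task:{}'.format(JOB_NAME, task_index)

def parse_role(argv):
    parser = argparse.ArgumentParser(description='Start one player of the cluster')
    parser.add_argument('role', choices=ROLES)
    args = parser.parse_args(argv)
    return ROLES.index(args.role)

# in a real script this would be parse_role(sys.argv[1:])
ROLE = parse_role(['crypto_producer'])
print(ROLE, device_for(ROLE))  # prints 2 /job:spdz/task:2
```

<p>The resulting integer is what would be passed as <code class="language-plaintext highlighter-rouge">task_index</code> when starting the server.</p>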
<h1 id="improvements">Improvements</h1>
<p>With the basics in place we can look at a few optimisations.</p>
<h2 id="tracking-nodes">Tracking nodes</h2>
<p>Our first improvement allows us to reuse computations. For instance, if we need the result of <code class="language-plaintext highlighter-rouge">dot(x, y)</code> twice then we want to avoid computing it a second time and instead reuse the first. Concretely, we want to keep track of nodes in the graph and link back to them whenever possible.</p>
<p>To do this we simply maintain a global dictionary of <code class="language-plaintext highlighter-rouge">PrivateTensor</code> references as we build the graph, and use this for looking up already existing results before adding new nodes. For instance, <code class="language-plaintext highlighter-rouge">dot</code> now becomes as follows.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">dot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
    <span class="k">assert</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">PrivateTensor</span><span class="p">)</span>
    <span class="k">assert</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">PrivateTensor</span><span class="p">)</span>
    <span class="n">node_key</span> <span class="o">=</span> <span class="p">(</span><span class="s">'dot'</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
    <span class="n">z</span> <span class="o">=</span> <span class="n">nodes</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">node_key</span><span class="p">,</span> <span class="bp">None</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">z</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
        <span class="c1"># ... as before ...
</span>
        <span class="n">z</span> <span class="o">=</span> <span class="n">PrivateTensor</span><span class="p">(</span><span class="n">z0</span><span class="p">,</span> <span class="n">z1</span><span class="p">)</span>
        <span class="n">z</span> <span class="o">=</span> <span class="n">truncate</span><span class="p">(</span><span class="n">z</span><span class="p">)</span>
        <span class="n">nodes</span><span class="p">[</span><span class="n">node_key</span><span class="p">]</span> <span class="o">=</span> <span class="n">z</span>
    <span class="k">return</span> <span class="n">z</span>
</code></pre></div></div>
<p>While already significant for some applications, this change also paves the way for our next improvement.</p>
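<p>The lookup behaviour can be tried out without TensorFlow. In the sketch below <code class="language-plaintext highlighter-rouge">PrivateTensor</code> is a bare stand-in class and <code class="language-plaintext highlighter-rouge">build_count</code> is a hypothetical counter replacing the actual graph construction; both are assumptions made for illustration.</p>

```python
nodes = {}
build_count = 0

class PrivateTensor(object):
    """Stand-in wrapper; without __eq__/__hash__ it is hashed by object identity."""
    def __init__(self, name):
        self.name = name

def dot(x, y):
    global build_count
    node_key = ('dot', x, y)
    z = nodes.get(node_key, None)
    if z is None:
        build_count += 1  # stands in for adding new nodes to the graph
        z = PrivateTensor('dot({}, {})'.format(x.name, y.name))
        nodes[node_key] = z
    return z

w = PrivateTensor('w')
x = PrivateTensor('x')

first = dot(w, x)
second = dot(w, x)  # found in the dictionary, nothing new is built
third = dot(x, w)   # different argument order, hence a different key

print(first is second, third is first, build_count)  # prints True False 2
```

<p>Since the keys rely on object identity, two distinct wrapper objects holding the same value would not share a node; reuse only happens when the very same Python object is passed again.</p>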
<h2 id="reusing-masked-tensors">Reusing masked tensors</h2>
<p>We have <a href="/2017/09/10/the-spdz-protocol-part2/">already</a> <a href="/2017/09/19/private-image-analysis-with-mpc/#generalised-triples">mentioned</a> that we’d ideally want to mask every private tensor at most once, primarily to save on networking. For instance, if we are computing both <code class="language-plaintext highlighter-rouge">dot(w, x)</code> and <code class="language-plaintext highlighter-rouge">dot(w, y)</code> then we want to use the same masked version of <code class="language-plaintext highlighter-rouge">w</code> in both. Specifically, if we are doing many operations with the same masked tensor then the cost of masking it can be amortised away.</p>
<p>But with the current setup we mask every time we compute e.g. <code class="language-plaintext highlighter-rouge">dot</code> or <code class="language-plaintext highlighter-rouge">mul</code> since masking is baked into these. To avoid this we simply make masking an explicit operation, which additionally allows us to use the same masked version across different operations.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">mask</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
    <span class="k">assert</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">PrivateTensor</span><span class="p">)</span>
    <span class="n">node_key</span> <span class="o">=</span> <span class="p">(</span><span class="s">'mask'</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span>
    <span class="n">masked</span> <span class="o">=</span> <span class="n">nodes</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">node_key</span><span class="p">,</span> <span class="bp">None</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">masked</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">x0</span><span class="p">,</span> <span class="n">x1</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">unwrapped</span>
        <span class="n">shape</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">shape</span>
        <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">name_scope</span><span class="p">(</span><span class="s">'mask'</span><span class="p">):</span>
            <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">CRYPTO_PRODUCER</span><span class="p">):</span>
                <span class="n">a</span> <span class="o">=</span> <span class="n">sample</span><span class="p">(</span><span class="n">shape</span><span class="p">)</span>
                <span class="n">a0</span><span class="p">,</span> <span class="n">a1</span> <span class="o">=</span> <span class="n">share</span><span class="p">(</span><span class="n">a</span><span class="p">)</span>
            <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">SERVER_0</span><span class="p">):</span>
                <span class="n">alpha0</span> <span class="o">=</span> <span class="n">crt_sub</span><span class="p">(</span><span class="n">x0</span><span class="p">,</span> <span class="n">a0</span><span class="p">)</span>
            <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">SERVER_1</span><span class="p">):</span>
                <span class="n">alpha1</span> <span class="o">=</span> <span class="n">crt_sub</span><span class="p">(</span><span class="n">x1</span><span class="p">,</span> <span class="n">a1</span><span class="p">)</span>
            <span class="c1"># exchange of alphas
</span>
            <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">SERVER_0</span><span class="p">):</span>
                <span class="n">alpha_on_0</span> <span class="o">=</span> <span class="n">reconstruct</span><span class="p">(</span><span class="n">alpha0</span><span class="p">,</span> <span class="n">alpha1</span><span class="p">)</span>
            <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">SERVER_1</span><span class="p">):</span>
                <span class="n">alpha_on_1</span> <span class="o">=</span> <span class="n">reconstruct</span><span class="p">(</span><span class="n">alpha0</span><span class="p">,</span> <span class="n">alpha1</span><span class="p">)</span>
        <span class="n">masked</span> <span class="o">=</span> <span class="n">MaskedPrivateTensor</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">a0</span><span class="p">,</span> <span class="n">a1</span><span class="p">,</span> <span class="n">alpha_on_0</span><span class="p">,</span> <span class="n">alpha_on_1</span><span class="p">)</span>
        <span class="n">nodes</span><span class="p">[</span><span class="n">node_key</span><span class="p">]</span> <span class="o">=</span> <span class="n">masked</span>
    <span class="k">return</span> <span class="n">masked</span>
</code></pre></div></div>
<p>Note that we introduce a <code class="language-plaintext highlighter-rouge">MaskedPrivateTensor</code> class as part of this, which is again simply a convenient way of abstracting over the five lists of tensors we get from <code class="language-plaintext highlighter-rouge">mask(x)</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">MaskedPrivateTensor</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">a0</span><span class="p">,</span> <span class="n">a1</span><span class="p">,</span> <span class="n">alpha_on_0</span><span class="p">,</span> <span class="n">alpha_on_1</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">a</span> <span class="o">=</span> <span class="n">a</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">a0</span> <span class="o">=</span> <span class="n">a0</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">a1</span> <span class="o">=</span> <span class="n">a1</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">alpha_on_0</span> <span class="o">=</span> <span class="n">alpha_on_0</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">alpha_on_1</span> <span class="o">=</span> <span class="n">alpha_on_1</span>
    <span class="o">@</span><span class="nb">property</span>
    <span class="k">def</span> <span class="nf">shape</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">a</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">shape</span>
    <span class="o">@</span><span class="nb">property</span>
    <span class="k">def</span> <span class="nf">unwrapped</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">a</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">a0</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">a1</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">alpha_on_0</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">alpha_on_1</span>
</code></pre></div></div>
<p>With this we may rewrite <code class="language-plaintext highlighter-rouge">dot</code> as below, which is now only responsible for the recombination step.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">dot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
    <span class="k">assert</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">PrivateTensor</span><span class="p">)</span> <span class="ow">or</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">MaskedPrivateTensor</span><span class="p">)</span>
    <span class="k">assert</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">PrivateTensor</span><span class="p">)</span> <span class="ow">or</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">MaskedPrivateTensor</span><span class="p">)</span>
    <span class="n">node_key</span> <span class="o">=</span> <span class="p">(</span><span class="s">'dot'</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
    <span class="n">z</span> <span class="o">=</span> <span class="n">nodes</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">node_key</span><span class="p">,</span> <span class="bp">None</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">z</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
        <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">PrivateTensor</span><span class="p">):</span> <span class="n">x</span> <span class="o">=</span> <span class="n">mask</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
        <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">PrivateTensor</span><span class="p">):</span> <span class="n">y</span> <span class="o">=</span> <span class="n">mask</span><span class="p">(</span><span class="n">y</span><span class="p">)</span>
        <span class="n">a</span><span class="p">,</span> <span class="n">a0</span><span class="p">,</span> <span class="n">a1</span><span class="p">,</span> <span class="n">alpha_on_0</span><span class="p">,</span> <span class="n">alpha_on_1</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">unwrapped</span>
        <span class="n">b</span><span class="p">,</span> <span class="n">b0</span><span class="p">,</span> <span class="n">b1</span><span class="p">,</span> <span class="n">beta_on_0</span><span class="p">,</span> <span class="n">beta_on_1</span> <span class="o">=</span> <span class="n">y</span><span class="o">.</span><span class="n">unwrapped</span>
        <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">name_scope</span><span class="p">(</span><span class="s">'dot'</span><span class="p">):</span>
            <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">CRYPTO_PRODUCER</span><span class="p">):</span>
                <span class="n">ab</span> <span class="o">=</span> <span class="n">crt_dot</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span>
                <span class="n">ab0</span><span class="p">,</span> <span class="n">ab1</span> <span class="o">=</span> <span class="n">share</span><span class="p">(</span><span class="n">ab</span><span class="p">)</span>
            <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">SERVER_0</span><span class="p">):</span>
                <span class="n">alpha</span> <span class="o">=</span> <span class="n">alpha_on_0</span>
                <span class="n">beta</span> <span class="o">=</span> <span class="n">beta_on_0</span>
                <span class="n">z0</span> <span class="o">=</span> <span class="n">crt_add</span><span class="p">(</span><span class="n">ab0</span><span class="p">,</span>
                     <span class="n">crt_add</span><span class="p">(</span><span class="n">crt_dot</span><span class="p">(</span><span class="n">a0</span><span class="p">,</span> <span class="n">beta</span><span class="p">),</span>
                     <span class="n">crt_add</span><span class="p">(</span><span class="n">crt_dot</span><span class="p">(</span><span class="n">alpha</span><span class="p">,</span> <span class="n">b0</span><span class="p">),</span>
                             <span class="n">crt_dot</span><span class="p">(</span><span class="n">alpha</span><span class="p">,</span> <span class="n">beta</span><span class="p">))))</span>
            <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">SERVER_1</span><span class="p">):</span>
                <span class="n">alpha</span> <span class="o">=</span> <span class="n">alpha_on_1</span>
                <span class="n">beta</span> <span class="o">=</span> <span class="n">beta_on_1</span>
                <span class="n">z1</span> <span class="o">=</span> <span class="n">crt_add</span><span class="p">(</span><span class="n">ab1</span><span class="p">,</span>
                     <span class="n">crt_add</span><span class="p">(</span><span class="n">crt_dot</span><span class="p">(</span><span class="n">a1</span><span class="p">,</span> <span class="n">beta</span><span class="p">),</span>
                             <span class="n">crt_dot</span><span class="p">(</span><span class="n">alpha</span><span class="p">,</span> <span class="n">b1</span><span class="p">)))</span>
        <span class="n">z</span> <span class="o">=</span> <span class="n">PrivateTensor</span><span class="p">(</span><span class="n">z0</span><span class="p">,</span> <span class="n">z1</span><span class="p">)</span>
        <span class="n">z</span> <span class="o">=</span> <span class="n">truncate</span><span class="p">(</span><span class="n">z</span><span class="p">)</span>
        <span class="n">nodes</span><span class="p">[</span><span class="n">node_key</span><span class="p">]</span> <span class="o">=</span> <span class="n">z</span>
    <span class="k">return</span> <span class="n">z</span>
</code></pre></div></div>
<p>As a verification we can see that TensorBoard shows us the expected graph structure, in this case inside the graph for <a href="/2017/04/17/private-deep-learning-with-mpc/#approximating-sigmoid"><code class="language-plaintext highlighter-rouge">sigmoid</code></a>.</p>
<p><img src="/assets/tensorspdz/masking-reuse.png" alt="" /></p>
<p>Here the value of <code class="language-plaintext highlighter-rouge">square(x)</code> is first computed, then masked, and finally reused in four multiplications.</p>
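<p>That a single mask can safely serve several products is easy to confirm on scalars. The sketch below makes simplifying assumptions that differ from the real code: values are plain integers modulo one prime <code class="language-plaintext highlighter-rouge">Q</code>, the reconstructed <code class="language-plaintext highlighter-rouge">alpha</code> is stored once rather than per server, there is no truncation, and the class names and the <code class="language-plaintext highlighter-rouge">mask_count</code> counter are hypothetical.</p>

```python
import random

Q = 2147483647  # illustrative prime modulus (assumption)
nodes = {}
mask_count = 0

def share(v):
    v0 = random.randrange(Q)
    return v0, (v - v0) % Q

class Private(object):
    def __init__(self, v0, v1):
        self.shares = (v0, v1)

class Masked(object):
    def __init__(self, a, a0, a1, alpha):
        self.unwrapped = (a, a0, a1, alpha)

def mask(x):
    global mask_count
    key = ('mask', x)
    if key not in nodes:
        mask_count += 1
        x0, x1 = x.shares
        a = random.randrange(Q)              # sampled by the crypto producer
        a0, a1 = share(a)
        alpha = ((x0 - a0) + (x1 - a1)) % Q  # exchanged and reconstructed by the servers
        nodes[key] = Masked(a, a0, a1, alpha)
    return nodes[key]

def mul(x, y):
    a, a0, a1, alpha = mask(x).unwrapped
    b, b0, b1, beta = mask(y).unwrapped
    ab0, ab1 = share(a * b % Q)              # only this part is new per product
    z0 = (ab0 + a0 * beta + alpha * b0 + alpha * beta) % Q
    z1 = (ab1 + a1 * beta + alpha * b1) % Q
    return Private(z0, z1)

def reveal(x):
    return sum(x.shares) % Q

w, x, y = Private(*share(3)), Private(*share(5)), Private(*share(7))
wx, wy = mul(w, x), mul(w, y)
print(reveal(wx), reveal(wy), mask_count)  # prints 15 21 3
```

<p>Three masks are created for two products: the mask for <code class="language-plaintext highlighter-rouge">w</code> is looked up the second time, and only the product of the two masks has to be freshly shared by the crypto producer.</p>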
<p>There is an inefficiency though: while the <a href="https://arxiv.org/abs/1603.04467">dataflow nature</a> of TensorFlow will in general take care of only recomputing the parts of the graph that have changed between two executions, this does not apply to operations involving sampling via e.g. <a href="https://www.tensorflow.org/api_docs/python/tf/random_uniform"><code class="language-plaintext highlighter-rouge">tf.random_uniform</code></a>, which is used in our sharing and masking. Consequently, masks are not being reused across executions.</p>
<h2 id="caching-values">Caching values</h2>
<p>To get around the above issue we can introduce caching of values that survive across different executions of the graph, and an easy way of doing this is to store tensors in <a href="https://www.tensorflow.org/api_docs/python/tf/Variable">variables</a>. Normal executions will read from these, while an explicit <code class="language-plaintext highlighter-rouge">cache_populators</code> set of operations allows us to populate them.</p>
<p>For example, wrapping our two tensors <code class="language-plaintext highlighter-rouge">w</code> and <code class="language-plaintext highlighter-rouge">b</code> with such a <code class="language-plaintext highlighter-rouge">cache</code> operation gets us the following graph.</p>
<p><img src="/assets/tensorspdz/cached.png" alt="" /></p>
<p>When executing the cache population operations TensorFlow automatically figures out which subparts of the graph it needs to execute to generate the values to be cached, and which can be ignored.</p>
<p><img src="/assets/tensorspdz/cached-populate.png" alt="" /></p>
<p>And likewise when predicting, in this case skipping sharing and masking.</p>
<p><img src="/assets/tensorspdz/cached-predict.png" alt="" /></p>
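<p>The split between population and normal execution can be mimicked in plain Python. This is an analogy rather than TensorFlow code: the <code class="language-plaintext highlighter-rouge">Cached</code> class and <code class="language-plaintext highlighter-rouge">fresh_mask</code> function are hypothetical, with the stored attribute playing the role of the variable and the <code class="language-plaintext highlighter-rouge">populate</code> method playing the role of one entry in <code class="language-plaintext highlighter-rouge">cache_populators</code>.</p>

```python
import random

class Cached(object):
    """Keeps a value across executions, similar to a tensor stored in a variable."""

    def __init__(self, compute):
        self.compute = compute  # the subgraph producing the value, e.g. share and mask
        self.value = None

    def populate(self):
        # explicit population run: the only place where compute() is executed
        self.value = self.compute()

    def read(self):
        # normal run: reads the stored value without re-executing compute()
        if self.value is None:
            raise RuntimeError('cache not populated')
        return self.value

def fresh_mask():
    # resampled on every call, like an operation built on tf.random_uniform
    return random.getrandbits(64)

cached_mask = Cached(fresh_mask)
cache_populators = [cached_mask.populate]

for populate in cache_populators:
    populate()

first = cached_mask.read()   # first prediction
second = cached_mask.read()  # second prediction sees the same mask
print(first == second)       # prints True
```

<p>Reading twice returns the same mask because sampling only happens during population, which is the property needed for masks to be reused across executions.</p>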
<h2 id="buffering-triples">Buffering triples</h2>
<p>Recall that a main purpose of <a href="/2017/09/03/the-spdz-protocol-part1/#multiplication">triples</a> is to move the computation of the crypto producer to an <em>offline phase</em> and distribute its results to the two servers ahead of time in order to speed up their computation later during the <em>online phase</em>.</p>
<p>So far we haven’t done anything to specify that this should happen though, and from reading the above code it’s not unreasonable to assume that the crypto producer will instead compute in synchronisation with the two servers, injecting idle waiting periods throughout their computation. However, from experiments it seems that TensorFlow is already smart enough to optimise the graph to do the right thing and batch triple distribution, presumably to save on networking. We still have an initial waiting period though, that we could get rid of by introducing a separate compute-and-distribute execution that fills up buffers.</p>
<p><img src="/assets/tensorspdz/tracing.png" alt="" /></p>
<p>We’ll skip this issue for now and instead return to it when looking at private training since it is not unreasonable to expect significant performance improvements there from distributing the training data ahead of time.</p>
<h1 id="profiling">Profiling</h1>
<p>As a final reason to be excited about building dataflow programs in TensorFlow we also look at the built-in <a href="https://www.tensorflow.org/programmers_guide/graph_viz#runtime_statistics">runtime statistics</a>. We have already seen the built-in detailed tracing support above, but in TensorBoard we can also easily see how expensive each operation was both in terms of compute and memory. The numbers reported here are from the <a href="#experiments">experiments</a> below.</p>
<p><img src="/assets/tensorspdz/computetime.png" alt="" /></p>
<p>The heatmap above shows, for example, that <code class="language-plaintext highlighter-rouge">sigmoid</code> was the most expensive operation in the run and that the dot product took roughly 30ms to execute. Moreover, in the figure below we have navigated further into the dot block and see that sharing took about 3ms in this particular run.</p>
<p><img src="/assets/tensorspdz/computetime-detailed.png" alt="" /></p>
<p>This way we can potentially identify bottlenecks and compare performance of different approaches. And if needed we can of course switch to tracing for even more details.</p>
<h1 id="experiments">Experiments</h1>
<p>The <a href="https://github.com/mortendahl/privateml/tree/master/tensorflow/spdz/">GitHub repository</a> contains the code needed for experimentation, including examples and instructions for setting up either a <a href="https://github.com/mortendahl/privateml/tree/master/tensorflow/spdz/configs/localhost">local configuration</a> or a <a href="https://github.com/mortendahl/privateml/tree/master/tensorflow/spdz/configs/gcp">GCP configuration</a> of hosts. For the running example of private prediction using a logistic regression model we use the GCP configuration, i.e. the parties are running on different virtual hosts located in the same Google Cloud zone, here on some of the weaker instances, namely dual core and 10GB memory.</p>
<p>A slightly simplified version of our program is as follows, where we first train a model in public, build a graph for the private prediction computation, and then run it in a fresh session. The model was somewhat arbitrarily picked to have 100 features.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">config</span> <span class="kn">import</span> <span class="n">session</span>
<span class="kn">from</span> <span class="nn">tensorspdz</span> <span class="kn">import</span> <span class="p">(</span>
    <span class="n">define_input</span><span class="p">,</span> <span class="n">define_variable</span><span class="p">,</span>
    <span class="n">add</span><span class="p">,</span> <span class="n">dot</span><span class="p">,</span> <span class="n">sigmoid</span><span class="p">,</span> <span class="n">cache</span><span class="p">,</span> <span class="n">mask</span><span class="p">,</span>
    <span class="n">encode_input</span><span class="p">,</span> <span class="n">decode_output</span>
<span class="p">)</span>

<span class="c1"># publicly train `weights` and `bias`
</span><span class="n">weights</span><span class="p">,</span> <span class="n">bias</span> <span class="o">=</span> <span class="n">train_publicly</span><span class="p">()</span>

<span class="c1"># define shape of unknown input
</span><span class="n">shape_x</span> <span class="o">=</span> <span class="n">X</span><span class="o">.</span><span class="n">shape</span>

<span class="c1"># construct graph for private prediction
</span><span class="n">input_x</span><span class="p">,</span> <span class="n">x</span> <span class="o">=</span> <span class="n">define_input</span><span class="p">(</span><span class="n">shape_x</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s">'x'</span><span class="p">)</span>
<span class="n">init_w</span><span class="p">,</span> <span class="n">w</span> <span class="o">=</span> <span class="n">define_variable</span><span class="p">(</span><span class="n">weights</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s">'w'</span><span class="p">)</span>
<span class="n">init_b</span><span class="p">,</span> <span class="n">b</span> <span class="o">=</span> <span class="n">define_variable</span><span class="p">(</span><span class="n">bias</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s">'b'</span><span class="p">)</span>
<span class="k">if</span> <span class="n">use_caching</span><span class="p">:</span>
    <span class="n">w</span> <span class="o">=</span> <span class="n">cache</span><span class="p">(</span><span class="n">mask</span><span class="p">(</span><span class="n">w</span><span class="p">))</span>
    <span class="n">b</span> <span class="o">=</span> <span class="n">cache</span><span class="p">(</span><span class="n">b</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">sigmoid</span><span class="p">(</span><span class="n">add</span><span class="p">(</span><span class="n">dot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">w</span><span class="p">),</span> <span class="n">b</span><span class="p">))</span>

<span class="c1"># start session between all players
</span><span class="k">with</span> <span class="n">session</span><span class="p">()</span> <span class="k">as</span> <span class="n">sess</span><span class="p">:</span>

    <span class="c1"># share and distribute `weights` and `bias` to the two servers
</span>    <span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">([</span><span class="n">init_w</span><span class="p">,</span> <span class="n">init_b</span><span class="p">])</span>

    <span class="k">if</span> <span class="n">use_caching</span><span class="p">:</span>
        <span class="c1"># compute and store cached values
</span>        <span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">cache_populators</span><span class="p">)</span>

    <span class="c1"># prepare to use `X` as private input for prediction
</span>    <span class="n">feed_dict</span> <span class="o">=</span> <span class="n">encode_input</span><span class="p">([</span>
        <span class="p">(</span><span class="n">input_x</span><span class="p">,</span> <span class="n">X</span><span class="p">)</span>
    <span class="p">])</span>

    <span class="c1"># run secure computation and reveal output
</span>    <span class="n">y_pred</span> <span class="o">=</span> <span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">reveal</span><span class="p">(</span><span class="n">y</span><span class="p">),</span> <span class="n">feed_dict</span><span class="o">=</span><span class="n">feed_dict</span><span class="p">)</span>
    <span class="nb">print</span><span class="p">(</span><span class="n">decode_output</span><span class="p">(</span><span class="n">y_pred</span><span class="p">))</span>
</code></pre></div></div>
<p>Running this a few times with different sizes of <code class="language-plaintext highlighter-rouge">X</code> gives the timings below, where the entire computation is considered including triple generation and distribution; slightly surprisingly there was no real difference between caching masked values or not.</p>
<center>
<img width="80%" height="80%" src="/assets/tensorspdz/timings-10000.png" />
</center>
<p>Processing batches of size 1, 10, and 100 took roughly the same time, ~60ms on average, which might suggest a lower latency bound due to networking. At 1000 the time jumps to ~110ms, at 10,000 to ~600ms, and finally at 100,000 to ~5s. As such, if latency is important then we can perform ~1600 predictions per second, while if we are more flexible then at least ~20,000 per second.</p>
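<p>These throughput figures follow directly from the batch timings. As a minimal sketch, assuming the approximate timings quoted above (the dictionary and function names are illustrative and not part of the repository), they can be reproduced as follows.</p>

```python
# Approximate timings quoted in the text: batch size -> milliseconds.
timings_ms = {1: 60, 10: 60, 100: 60, 1000: 110, 10000: 600, 100000: 5000}


def predictions_per_second(batch_size, millis):
    """Throughput implied by processing `batch_size` inputs in `millis` ms."""
    return batch_size * 1000 / millis


throughput = {n: predictions_per_second(n, ms) for n, ms in timings_ms.items()}

for n, rate in throughput.items():
    print(f"batch size {n:>6}: ~{rate:,.0f} predictions per second")
```

<p>The largest batch that still stays at the ~60ms latency floor (100 inputs) gives roughly 1,600 predictions per second, while the largest batch overall gives 20,000.</p>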
<center>
<img width="80%" height="80%" src="/assets/tensorspdz/timings-100000.png" />
</center>
<p>This however is measuring only timings reported by profiling, with actual execution time taking a bit longer; hopefully some of the production-oriented tools such as <a href="https://www.tensorflow.org/serving/"><code class="language-plaintext highlighter-rouge">tf.serving</code></a> that come with TensorFlow can improve on this.</p>
<h1 id="thoughts">Thoughts</h1>
<p>After private prediction it’ll of course also be interesting to look at private training. Caching of masked training data might be more relevant here since it remains fixed throughout the process.</p>
<p>The serving of models can also be improved. Using for instance the production-ready <a href="https://www.tensorflow.org/serving/"><code class="language-plaintext highlighter-rouge">tf.serving</code></a>, one might be able to avoid much of the current initial overhead for orchestration, as well as have endpoints that can be <a href="https://github.com/tensorflow/tensorflow/blob/master/SECURITY.md">safely exposed</a> to the public.</p>
<p>Finally, there are security improvements to be made on e.g. communication between the five parties. In particular, in the current version of TensorFlow all communication is happening over unencrypted and unauthenticated <a href="https://grpc.io/">gRPC</a> connections, which means that someone listening in on the network traffic in principle could learn all private values. Since support for <a href="https://grpc.io/docs/guides/auth.html">TLS</a> is already there in gRPC it might be straight-forward to make use of it in TensorFlow without a significant impact on performance. Likewise, TensorFlow does not currently use a strong pseudo-random generator for <a href="https://www.tensorflow.org/api_docs/python/tf/random_uniform"><code class="language-plaintext highlighter-rouge">tf.random_uniform</code></a> and hence sharing and masking are not as secure as they could be; adding an operation for cryptographically strong randomness might be straight-forward and should give roughly the same performance.</p>
Morten DahlTL;DR: using TensorFlow as a distributed computation framework for dataflow programs we give a full implementation of the SPDZ protocol with networking, in turn enabling optimised machine learning on encrypted data.Private Image Analysis with MPC2017-09-19T12:00:00+00:002017-09-19T12:00:00+00:00https://mortendahl.github.io/2017/09/19/private-image-analysis-with-mpc<p><em><strong>TL;DR:</strong> we take a typical CNN deep learning model and go through a series of steps that enable both training and prediction to instead be done on encrypted data.</em></p>
<p>Using deep learning to analyse images through <a href="http://cs231n.github.io/">convolutional neural networks</a> (CNNs) has gained enormous popularity over the last few years due to their success in out-performing many other approaches on this and related tasks.</p>
<p>One recent application took the form of <a href="http://www.nature.com/nature/journal/v542/n7639/full/nature21056.html">skin cancer detection</a>, where anyone can quickly take a photo of a skin lesion using a mobile phone app and have it analysed with “performance on par with [..] experts” (see the <a href="https://www.youtube.com/watch?v=toK1OSLep3s">associated video</a> for a demo). Having access to a large set of clinical photos played a key part in training this model – a data set that could be considered sensitive.</p>
<p>Which brings us to privacy and eventually <a href="https://en.wikipedia.org/wiki/Secure_multi-party_computation">secure multi-party computation</a> (MPC): how many applications are limited today due to the lack of access to data? In the above case, could the model be improved by letting anyone with a mobile phone app contribute to the training data set? And if so, how many would volunteer given the risk of exposing personal health related information?</p>
<p>With MPC we can potentially lower the risk of exposure and hence increase the incentive to participate. More concretely, by instead performing the training on encrypted data we can prevent anyone from ever seeing not only individual data, but also the learned model parameters. Further techniques such as <a href="https://en.wikipedia.org/wiki/Differential_privacy">differential privacy</a> could additionally be used to hide any leakage from predictions as well, but we won’t go into that here.</p>
<p>In this blog post we’ll look at a simpler use case for image analysis but go over all required techniques. A few notebooks are presented along the way, with the main one given as part of the <a href="#proof-of-concept-implementation">proof of concept implementation</a>.</p>
<p><a href="https://github.com/mortendahl/talks/raw/master/ParisML17.pdf">Slides</a> from a more recent presentation at the <a href="http://mlparis.org/">Paris Machine Learning meetup</a> are now also available.</p>
<p><em>A big thank you goes out to <a href="https://twitter.com/iamtrask">Andrew Trask</a>, <a href="https://twitter.com/smartcryptology">Nigel Smart</a>, <a href="https://twitter.com/adria">Adrià Gascón</a>, and the <a href="https://twitter.com/openminedorg">OpenMined community</a> for inspiration and interesting discussions on this topic! <a href="https://weakish.github.io/">Jakukyo Friel</a> has also very kindly made a <a href="https://www.jqr.com/article/000113">Chinese translation</a>.</em></p>
<h1 id="setting">Setting</h1>
<p>We will assume that the training data set is jointly held by a set of <em>input providers</em> and that the training is performed by two distinct <em>servers</em> (or <em>parties</em>) that are trusted not to collaborate beyond what our protocol specifies. In practice, these servers could for instance be virtual instances in a shared cloud environment operated by two different organisations.</p>
<p>The input providers are only needed in the very beginning to transmit their (encrypted) training data; after that all computations involve only the two servers, meaning it is indeed plausible for the input providers to use e.g. mobile phones. Once trained, the model will remain jointly held in encrypted form by the two servers where anyone can use it to make further encrypted predictions.</p>
<p>For technical reasons we also assume a distinct <em>crypto producer</em> that generates certain raw material used during the computation for increased efficiency; there are ways to eliminate this additional entity but we won’t go into that here.</p>
<p>Finally, in terms of security we aim for a typical notion used in practice, namely <em>honest-but-curious (or passive) security</em>, where the servers are assumed to follow the protocol but may otherwise try to learn as much as possible from the data they see. While a slightly weaker notion than <em>fully malicious (or active) security</em> with respect to the servers, this still gives strong protection against anyone who may compromise one of the servers <em>after</em> the computations, regardless of what they do. Note that for the purpose of this blog post we will actually allow a small privacy leakage during training as detailed later.</p>
<h1 id="image-analysis-with-cnns">Image Analysis with CNNs</h1>
<p>Our use case is the canonical <a href="https://www.tensorflow.org/get_started/mnist/beginners">MNIST handwritten digit recognition</a>, namely learning to identify the Arabic numeral in a given image, and we will use the following CNN model from a <a href="https://github.com/fchollet/keras/blob/master/examples/mnist_transfer_cnn.py">Keras example</a> as our base.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">keras.models</span> <span class="kn">import</span> <span class="n">Sequential</span>
<span class="kn">from</span> <span class="nn">keras.layers</span> <span class="kn">import</span> <span class="p">(</span>
    <span class="n">Conv2D</span><span class="p">,</span> <span class="n">Activation</span><span class="p">,</span> <span class="n">MaxPooling2D</span><span class="p">,</span>
    <span class="n">Dropout</span><span class="p">,</span> <span class="n">Flatten</span><span class="p">,</span> <span class="n">Dense</span>
<span class="p">)</span>

<span class="n">feature_layers</span> <span class="o">=</span> <span class="p">[</span>
    <span class="n">Conv2D</span><span class="p">(</span><span class="mi">32</span><span class="p">,</span> <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span> <span class="n">padding</span><span class="o">=</span><span class="s">'same'</span><span class="p">,</span> <span class="n">input_shape</span><span class="o">=</span><span class="p">(</span><span class="mi">28</span><span class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span class="mi">1</span><span class="p">)),</span>
    <span class="n">Activation</span><span class="p">(</span><span class="s">'relu'</span><span class="p">),</span>
    <span class="n">Conv2D</span><span class="p">(</span><span class="mi">32</span><span class="p">,</span> <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span> <span class="n">padding</span><span class="o">=</span><span class="s">'same'</span><span class="p">),</span>
    <span class="n">Activation</span><span class="p">(</span><span class="s">'relu'</span><span class="p">),</span>
    <span class="n">MaxPooling2D</span><span class="p">(</span><span class="n">pool_size</span><span class="o">=</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">)),</span>
    <span class="n">Dropout</span><span class="p">(</span><span class="mf">.25</span><span class="p">),</span>
    <span class="n">Flatten</span><span class="p">()</span>
<span class="p">]</span>

<span class="n">classification_layers</span> <span class="o">=</span> <span class="p">[</span>
    <span class="n">Dense</span><span class="p">(</span><span class="mi">128</span><span class="p">),</span>
    <span class="n">Activation</span><span class="p">(</span><span class="s">'relu'</span><span class="p">),</span>
    <span class="n">Dropout</span><span class="p">(</span><span class="mf">.50</span><span class="p">),</span>
    <span class="n">Dense</span><span class="p">(</span><span class="n">NUM_CLASSES</span><span class="p">),</span>
    <span class="n">Activation</span><span class="p">(</span><span class="s">'softmax'</span><span class="p">)</span>
<span class="p">]</span>

<span class="n">model</span> <span class="o">=</span> <span class="n">Sequential</span><span class="p">(</span><span class="n">feature_layers</span> <span class="o">+</span> <span class="n">classification_layers</span><span class="p">)</span>

<span class="n">model</span><span class="o">.</span><span class="nb">compile</span><span class="p">(</span>
    <span class="n">loss</span><span class="o">=</span><span class="s">'categorical_crossentropy'</span><span class="p">,</span>
    <span class="n">optimizer</span><span class="o">=</span><span class="s">'adam'</span><span class="p">,</span>
    <span class="n">metrics</span><span class="o">=</span><span class="p">[</span><span class="s">'accuracy'</span><span class="p">])</span>

<span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span>
    <span class="n">x_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">,</span>
    <span class="n">epochs</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
    <span class="n">batch_size</span><span class="o">=</span><span class="mi">32</span><span class="p">,</span>
    <span class="n">verbose</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
    <span class="n">validation_data</span><span class="o">=</span><span class="p">(</span><span class="n">x_test</span><span class="p">,</span> <span class="n">y_test</span><span class="p">))</span>
</code></pre></div></div>
<p>We won’t go into the details of this model here since the principles are already <a href="http://cs231n.stanford.edu/">well-covered</a> <a href="https://github.com/ageron/handson-ml">elsewhere</a>, but the basic idea is to first run an image through a set of <em>feature layers</em> that transforms the raw pixels of the input image into abstract properties that are more relevant for our classification task. These properties are then subsequently combined by a set of <em>classification layers</em> to yield a probability distribution over the possible digits. The final outcome is then typically simply the digit with highest assigned probability.</p>
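<p>To make the last step concrete, here is a minimal sketch of picking the final digit from the output of the softmax layer. The probability values are made up for illustration and do not come from the trained model.</p>

```python
# Hypothetical softmax output for one image: one probability per digit 0-9.
probabilities = [0.01, 0.02, 0.05, 0.70, 0.02, 0.05, 0.05, 0.04, 0.03, 0.03]

# The prediction is the index (i.e. the digit) with the highest probability.
predicted_digit = max(range(len(probabilities)), key=probabilities.__getitem__)

print(predicted_digit)
```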
<p>As we shall see, using <a href="https://keras.io/">Keras</a> has the benefit that we can perform quick experiments on unencrypted data to get an idea of the performance of the model itself, as well as providing a simple interface to later mimic in the encrypted setting.</p>
<h1 id="secure-computation-with-spdz">Secure Computation with SPDZ</h1>
<p>With CNNs in place we next turn to MPC. For this we will use the state-of-the-art SPDZ protocol as it allows us to only have two servers and to improve <em>online</em> performance by moving certain computations to an <em>offline</em> phase as described in detail in earlier <a href="/2017/09/03/the-spdz-protocol-part1">blog</a> <a href="/2017/09/10/the-spdz-protocol-part2">posts</a>.</p>
<p>As is typical in secure computation protocols, all computations take place in a field, here identified by a prime <code class="language-plaintext highlighter-rouge">Q</code>. This means we need to <a href="/2017/09/03/the-spdz-protocol-part1#fixed-point-encoding">encode</a> the floating-point numbers used by the CNNs as integers modulo a prime, which puts certain constraints on <code class="language-plaintext highlighter-rouge">Q</code> and in turn has an effect on performance.</p>
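<p>As a rough illustration of such an encoding, the sketch below scales a number to a fixed number of fractional digits and maps negative values into the upper half of the field. The concrete modulus, base, and precision are assumptions picked for the example and are not necessarily the parameters used in the notebooks.</p>

```python
# Illustrative parameters only (assumptions, not the values used in the post).
Q = 2**61 - 1     # a Mersenne prime serving as the field modulus
BASE = 10
PRECISION = 6     # number of fractional digits kept


def encode(rational):
    """Scale to an integer; negative numbers wrap into the upper half of the field."""
    return round(rational * BASE**PRECISION) % Q


def decode(element):
    """Interpret the upper half of the field as negative numbers, then rescale."""
    signed = element if element <= Q // 2 else element - Q
    return signed / BASE**PRECISION


x = encode(-1.5)
y = encode(2.25)
total = (x + y) % Q   # addition in the field matches addition of the encoded numbers

print(decode(total))
```

<p>One constraint on <code class="language-plaintext highlighter-rouge">Q</code> is visible here: the product of two encoded values carries twice as many fractional digits, so the field must be large enough to hold it before it is truncated back down.</p>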
<p>Moreover, <a href="/2017/09/10/the-spdz-protocol-part2">recall</a> that in interactive computations such as the SPDZ protocol it becomes relevant to also consider communication and round complexity, in addition to the typical time complexity. Here, the former measures the number of bits sent across the network, which is a relatively slow process, and the latter the number of synchronisation points needed between the two servers, which may block one of them with nothing to do until the other catches up. Both hence also have a big impact on overall execution time.</p>
<p>Most importantly, however, the only “native” operations we have in these protocols are addition and multiplication. Division, comparison, etc. can be done, but are more expensive in terms of our three performance measures. Later we shall see how to mitigate some of the issues this raises, but here we first recall the basic SPDZ protocol.</p>
<h2 id="tensor-operations">Tensor operations</h2>
<p>When we introduced the SPDZ protocol <a href="/2017/09/03/the-spdz-protocol-part1">earlier</a> we did so in the form of classes <code class="language-plaintext highlighter-rouge">PublicValue</code> and <code class="language-plaintext highlighter-rouge">PrivateValue</code> representing respectively a (scalar) value known in clear by both servers and an encrypted value known only in secret shared form. In this blog post, we now instead present it more naturally via classes <code class="language-plaintext highlighter-rouge">PublicTensor</code> and <code class="language-plaintext highlighter-rouge">PrivateTensor</code> that reflect the heavy use of <a href="https://www.tensorflow.org/programmers_guide/tensors">tensors</a> in our deep learning setting.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">PrivateTensor</span><span class="p">:</span>

    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">values</span><span class="p">,</span> <span class="n">shares0</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">shares1</span><span class="o">=</span><span class="bp">None</span><span class="p">):</span>
        <span class="k">if</span> <span class="n">values</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
            <span class="n">shares0</span><span class="p">,</span> <span class="n">shares1</span> <span class="o">=</span> <span class="n">share</span><span class="p">(</span><span class="n">values</span><span class="p">)</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">shares0</span> <span class="o">=</span> <span class="n">shares0</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">shares1</span> <span class="o">=</span> <span class="n">shares1</span>

    <span class="k">def</span> <span class="nf">reconstruct</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="n">PublicTensor</span><span class="p">(</span><span class="n">reconstruct</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">shares0</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">shares1</span><span class="p">))</span>

    <span class="k">def</span> <span class="nf">add</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
        <span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">y</span><span class="p">)</span> <span class="ow">is</span> <span class="n">PublicTensor</span><span class="p">:</span>
            <span class="c1"># the public values are added to one of the shares only</span>
            <span class="n">shares0</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">shares0</span> <span class="o">+</span> <span class="n">y</span><span class="o">.</span><span class="n">values</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
            <span class="n">shares1</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">shares1</span>
            <span class="k">return</span> <span class="n">PrivateTensor</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="n">shares0</span><span class="p">,</span> <span class="n">shares1</span><span class="p">)</span>
        <span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">y</span><span class="p">)</span> <span class="ow">is</span> <span class="n">PrivateTensor</span><span class="p">:</span>
            <span class="n">shares0</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">shares0</span> <span class="o">+</span> <span class="n">y</span><span class="o">.</span><span class="n">shares0</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
            <span class="n">shares1</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">shares1</span> <span class="o">+</span> <span class="n">y</span><span class="o">.</span><span class="n">shares1</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
            <span class="k">return</span> <span class="n">PrivateTensor</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="n">shares0</span><span class="p">,</span> <span class="n">shares1</span><span class="p">)</span>

    <span class="k">def</span> <span class="nf">mul</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
        <span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">y</span><span class="p">)</span> <span class="ow">is</span> <span class="n">PublicTensor</span><span class="p">:</span>
            <span class="n">shares0</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">shares0</span> <span class="o">*</span> <span class="n">y</span><span class="o">.</span><span class="n">values</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
            <span class="n">shares1</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">shares1</span> <span class="o">*</span> <span class="n">y</span><span class="o">.</span><span class="n">values</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
            <span class="k">return</span> <span class="n">PrivateTensor</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="n">shares0</span><span class="p">,</span> <span class="n">shares1</span><span class="p">)</span>
        <span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">y</span><span class="p">)</span> <span class="ow">is</span> <span class="n">PrivateTensor</span><span class="p">:</span>
            <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">a_mul_b</span> <span class="o">=</span> <span class="n">generate_mul_triple</span><span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">shape</span><span class="p">,</span> <span class="n">y</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span>
            <span class="n">alpha</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span> <span class="o">-</span> <span class="n">a</span><span class="p">)</span><span class="o">.</span><span class="n">reconstruct</span><span class="p">()</span>
            <span class="n">beta</span> <span class="o">=</span> <span class="p">(</span><span class="n">y</span> <span class="o">-</span> <span class="n">b</span><span class="p">)</span><span class="o">.</span><span class="n">reconstruct</span><span class="p">()</span>
            <span class="k">return</span> <span class="n">alpha</span><span class="o">.</span><span class="n">mul</span><span class="p">(</span><span class="n">beta</span><span class="p">)</span> <span class="o">+</span> \
                   <span class="n">alpha</span><span class="o">.</span><span class="n">mul</span><span class="p">(</span><span class="n">b</span><span class="p">)</span> <span class="o">+</span> \
                   <span class="n">a</span><span class="o">.</span><span class="n">mul</span><span class="p">(</span><span class="n">beta</span><span class="p">)</span> <span class="o">+</span> \
                   <span class="n">a_mul_b</span>
</code></pre></div></div>
<p>As seen, the adaptation is pretty straightforward using NumPy and the general form of for instance <code class="language-plaintext highlighter-rouge">PrivateTensor</code> is almost exactly the same, only occasionally passing a shape around as well. There are a few technical details however, all of which are available in full in <a href="https://github.com/mortendahl/privateml/blob/master/spdz/Tensor%20SPDZ.ipynb">the associated notebook</a>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">share</span><span class="p">(</span><span class="n">secrets</span><span class="p">):</span>
    <span class="n">shares0</span> <span class="o">=</span> <span class="n">sample_random_tensor</span><span class="p">(</span><span class="n">secrets</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span>
    <span class="n">shares1</span> <span class="o">=</span> <span class="p">(</span><span class="n">secrets</span> <span class="o">-</span> <span class="n">shares0</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
    <span class="k">return</span> <span class="n">shares0</span><span class="p">,</span> <span class="n">shares1</span>

<span class="k">def</span> <span class="nf">reconstruct</span><span class="p">(</span><span class="n">shares0</span><span class="p">,</span> <span class="n">shares1</span><span class="p">):</span>
    <span class="n">secrets</span> <span class="o">=</span> <span class="p">(</span><span class="n">shares0</span> <span class="o">+</span> <span class="n">shares1</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
    <span class="k">return</span> <span class="n">secrets</span>

<span class="k">def</span> <span class="nf">generate_mul_triple</span><span class="p">(</span><span class="n">x_shape</span><span class="p">,</span> <span class="n">y_shape</span><span class="p">):</span>
    <span class="n">a</span> <span class="o">=</span> <span class="n">sample_random_tensor</span><span class="p">(</span><span class="n">x_shape</span><span class="p">)</span>
    <span class="n">b</span> <span class="o">=</span> <span class="n">sample_random_tensor</span><span class="p">(</span><span class="n">y_shape</span><span class="p">)</span>
    <span class="n">c</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">multiply</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
    <span class="k">return</span> <span class="n">PrivateTensor</span><span class="p">(</span><span class="n">a</span><span class="p">),</span> <span class="n">PrivateTensor</span><span class="p">(</span><span class="n">b</span><span class="p">),</span> <span class="n">PrivateTensor</span><span class="p">(</span><span class="n">c</span><span class="p">)</span>
</code></pre></div></div>
<p>As such, perhaps the biggest difference is in the above base utility methods where this shape is used.</p>
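<p>To make the role of the shape concrete, the following is a minimal self-contained sketch of sharing and reconstructing a whole tensor at once. The modulus <code class="language-plaintext highlighter-rouge">Q</code>, the explicit <code class="language-plaintext highlighter-rouge">rng</code> argument, and the body of <code class="language-plaintext highlighter-rouge">sample_random_tensor</code> are assumptions made for illustration and may differ from what the notebook uses.</p>

```python
import numpy as np

# Assumed modulus for this sketch; small enough that sums fit in int64.
Q = 2**31 - 1

def sample_random_tensor(shape, rng):
    # One uniformly random field element per entry of the requested shape.
    return rng.integers(0, Q, size=shape, dtype=np.int64)

def share(secrets, rng):
    # The shape of the secret tensor determines the shape of both shares.
    shares0 = sample_random_tensor(secrets.shape, rng)
    shares1 = (secrets - shares0) % Q
    return shares0, shares1

def reconstruct(shares0, shares1):
    return (shares0 + shares1) % Q

rng = np.random.default_rng(0)
secrets = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)
shares0, shares1 = share(secrets, rng)
recovered = reconstruct(shares0, shares1)
```

<p>Each share on its own is a uniformly random tensor of the same shape as the secret; only their element-wise sum modulo <code class="language-plaintext highlighter-rouge">Q</code> gives back the original values.</p>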
<h1 id="adapting-the-model">Adapting the Model</h1>
<p>While it is in principle possible to compute any function securely with what we already have, and hence also the base model from above, in practice it is relevant to first consider variants of the model that are more MPC friendly, and vice versa. In slightly more picturesque words, it is common to open up our two black boxes and adapt the two technologies to better fit each other.</p>
<p>The root of this comes from some operations being surprisingly expensive in the encrypted setting. We saw above that addition and multiplication are relatively cheap, yet comparison and division with private denominator are not. For this reason we make a few changes to the model to avoid these.</p>
<p>The various changes presented in this section as well as their simulation performances are available in full in the <a href="https://github.com/mortendahl/privateml/blob/master/image-analysis/Keras.ipynb">associated Python notebook</a>.</p>
<h2 id="optimizer">Optimizer</h2>
<p>The first issue involves the optimizer: while <a href="http://ruder.io/optimizing-gradient-descent/index.html#adam"><em>Adam</em></a> is a preferred choice in many implementations for its efficiency, it also involves taking a square root of a private value and using one as the denominator in a division. While it is theoretically possible to <a href="https://eprint.iacr.org/2012/164">compute these securely</a>, in practice it could be a significant bottleneck for performance and hence relevant to avoid.</p>
<p>A simple remedy is to switch to the <a href="http://ruder.io/optimizing-gradient-descent/index.html#momentum"><em>momentum SGD</em></a> optimizer, which may imply longer training time but only uses simple operations.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">model</span><span class="o">.</span><span class="nb">compile</span><span class="p">(</span>
<span class="n">loss</span><span class="o">=</span><span class="s">'categorical_crossentropy'</span><span class="p">,</span>
<span class="n">optimizer</span><span class="o">=</span><span class="n">SGD</span><span class="p">(</span><span class="n">clipnorm</span><span class="o">=</span><span class="mi">10000</span><span class="p">,</span> <span class="n">clipvalue</span><span class="o">=</span><span class="mi">10000</span><span class="p">),</span>
<span class="n">metrics</span><span class="o">=</span><span class="p">[</span><span class="s">'accuracy'</span><span class="p">])</span>
</code></pre></div></div>
<p>An additional caveat is that many optimizers use <a href="http://nmarkou.blogspot.fr/2017/07/deep-learning-why-you-should-use.html">clipping</a> to prevent gradients from growing too small or too large. This requires a <a href="https://www1.cs.fau.de/filepool/publications/octavian_securescm/smcint-scn10.pdf">comparison on private values</a>, which again is a somewhat expensive operation in the encrypted setting, and as a result we aim to avoid using this technique altogether. To get realistic results from our Keras simulation we increase the bounds as seen above.</p>
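<p>As a rough illustration of why momentum SGD is cheap in the encrypted setting, here is a plaintext NumPy sketch of a single update step. The function name and hyperparameter values are illustrative only. The point is that, with a public learning rate and momentum, every operation is either an addition or a multiplication by a public constant; no square root, division by a private value, or comparison is needed.</p>

```python
import numpy as np

def momentum_sgd_step(weights, velocity, gradient, lr=0.01, momentum=0.9):
    # Only additions and multiplications by public constants are used here.
    # Adam would additionally divide by the square root of a running
    # average of squared gradients, which is a private value.
    velocity = momentum * velocity - lr * gradient
    weights = weights + velocity
    return weights, velocity

# Toy usage: minimise 0.5 * ||w||^2, whose gradient is simply w.
weights = np.array([1.0, -2.0])
velocity = np.zeros_like(weights)
for _ in range(200):
    gradient = weights
    weights, velocity = momentum_sgd_step(
        weights, velocity, gradient, lr=0.1, momentum=0.9)
```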
<h2 id="layers">Layers</h2>
<p>Speaking of comparisons, the <em>ReLU</em> and max-pooling layers pose similar problems. In <a href="https://www.microsoft.com/en-us/research/publication/cryptonets-applying-neural-networks-to-encrypted-data-with-high-throughput-and-accuracy/">CryptoNets</a> the former is replaced by a squaring function and the latter by average pooling, while <a href="https://eprint.iacr.org/2017/396">SecureML</a> implements a ReLU-like activation function by adding complexity that we wish to avoid to keep things simple. As such, we here instead use higher-degree sigmoid activation functions and average-pooling layers. Note that average-pooling also uses a division, yet this time the denominator is a public value, and hence division is simply a public inversion followed by a multiplication.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">feature_layers</span> <span class="o">=</span> <span class="p">[</span>
<span class="n">Conv2D</span><span class="p">(</span><span class="mi">32</span><span class="p">,</span> <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span> <span class="n">padding</span><span class="o">=</span><span class="s">'same'</span><span class="p">,</span> <span class="n">input_shape</span><span class="o">=</span><span class="p">(</span><span class="mi">28</span><span class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span class="mi">1</span><span class="p">)),</span>
<span class="n">Activation</span><span class="p">(</span><span class="s">'sigmoid'</span><span class="p">),</span>
<span class="n">Conv2D</span><span class="p">(</span><span class="mi">32</span><span class="p">,</span> <span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">),</span> <span class="n">padding</span><span class="o">=</span><span class="s">'same'</span><span class="p">),</span>
<span class="n">Activation</span><span class="p">(</span><span class="s">'sigmoid'</span><span class="p">),</span>
<span class="n">AveragePooling2D</span><span class="p">(</span><span class="n">pool_size</span><span class="o">=</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span><span class="mi">2</span><span class="p">)),</span>
<span class="n">Dropout</span><span class="p">(</span><span class="mf">.25</span><span class="p">),</span>
<span class="n">Flatten</span><span class="p">()</span>
<span class="p">]</span>
<span class="n">classification_layers</span> <span class="o">=</span> <span class="p">[</span>
<span class="n">Dense</span><span class="p">(</span><span class="mi">128</span><span class="p">),</span>
<span class="n">Activation</span><span class="p">(</span><span class="s">'sigmoid'</span><span class="p">),</span>
<span class="n">Dropout</span><span class="p">(</span><span class="mf">.50</span><span class="p">),</span>
<span class="n">Dense</span><span class="p">(</span><span class="n">NUM_CLASSES</span><span class="p">),</span>
<span class="n">Activation</span><span class="p">(</span><span class="s">'softmax'</span><span class="p">)</span>
<span class="p">]</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">Sequential</span><span class="p">(</span><span class="n">feature_layers</span> <span class="o">+</span> <span class="n">classification_layers</span><span class="p">)</span>
</code></pre></div></div>
<p><a href="https://github.com/mortendahl/privateml/blob/master/image-analysis/Keras.ipynb">Simulations</a> indicate that with this change we now have to bump the number of epochs, slowing down training time by an equal factor. Other choices of learning rate or momentum may improve this.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span>
<span class="n">x_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">,</span>
<span class="n">epochs</span><span class="o">=</span><span class="mi">15</span><span class="p">,</span>
<span class="n">batch_size</span><span class="o">=</span><span class="mi">32</span><span class="p">,</span>
<span class="n">verbose</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
<span class="n">validation_data</span><span class="o">=</span><span class="p">(</span><span class="n">x_test</span><span class="p">,</span> <span class="n">y_test</span><span class="p">))</span>
</code></pre></div></div>
<p>The remaining layers are easily dealt with. Dropout and flatten do not care about whether we’re in an encrypted or unencrypted setting, and dense and convolution are matrix dot products which only require basic operations.</p>
<h2 id="softmax-and-loss-function">Softmax and loss function</h2>
<p>The final <em>softmax</em> layer also causes complications for training in the encrypted setting as we need to compute both an <a href="https://cs.umd.edu/~fenghao/paper/modexp.pdf">exponentiation using a private exponent</a> as well as normalisation in the form of a division with a private denominator.</p>
<p>While both remain possible we here choose a much simpler approach and allow the predicted class likelihoods for each training sample to be revealed to one of the servers, who can then compute the result from the revealed values. This of course results in a privacy leakage that may or may not pose an acceptable risk.</p>
<p>One heuristic improvement is for the servers to first permute the vector of class likelihoods for each training sample before revealing anything, thereby hiding which likelihood corresponds to which class. However, this may be of little effect if e.g. “healthy” often means a narrow distribution over classes while “sick” means a spread distribution.</p>
<p>Another is to introduce a dedicated third server who only does this small computation, doesn’t see anything else from the training data, and hence cannot relate the labels with the sample data. Something is still leaked though, and this quantity is hard to reason about.</p>
<p>Finally, we could also replace this <a href="https://en.wikipedia.org/wiki/Multiclass_classification#One-vs.-rest">one-vs-all</a> approach with a <a href="https://en.wikipedia.org/wiki/Multiclass_classification#One-vs.-one">one-vs-one</a> approach using e.g. sigmoids. As argued earlier this allows us to fully compute the predictions without decrypting. We still need to compute the loss however, which could be done by also considering a different loss function.</p>
<p>Note that none of the issues mentioned here occur when later performing predictions using the trained network, as there is no loss to be computed and the servers can there simply skip the softmax layer and let the recipient of the prediction compute it themselves on the revealed values: for them it’s simply a question of how the values are interpreted.</p>
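<p>For the prediction case, the recipient’s local step can be sketched as below. This is plain NumPy on values that have already been revealed to the recipient; the example scores are made up. Since softmax is monotone, the predicted class is the same whether or not it is applied, which is why skipping it on the servers costs nothing.</p>

```python
import numpy as np

def softmax(scores):
    # Subtract the row maximum for numerical stability; this does not
    # change the result.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=-1, keepdims=True)

# Hypothetical output of the last dense layer, revealed to the recipient.
revealed = np.array([[2.0, -1.0, 0.5, 0.0, 1.0]])
probabilities = softmax(revealed)
predicted_class = probabilities.argmax(axis=-1)
```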
<h2 id="transfer-learning">Transfer Learning</h2>
<p>At this point <a href="https://github.com/mortendahl/privateml/blob/master/image-analysis/Keras.ipynb">it seems</a> that we can actually train the model as-is and get decent results. But as often done in CNNs we can get significant speed-ups by employing <a href="http://cs231n.github.io/transfer-learning/">transfer</a> <a href="http://ruder.io/transfer-learning/">learning</a>; in fact, it is somewhat <a href="https://yashk2810.github.io/Transfer-Learning/">well-known</a> that “very few people train their own convolutional net from scratch because they don’t have sufficient data” and that “it is always recommended to use transfer learning in practice”.</p>
<p>A particular application to our setting here is that training may be split into a pre-training phase using non-sensitive public data and a fine-tuning phase using sensitive private data. For instance, in the case of a skin cancer detector, the researchers may choose to pre-train on a public set of photos and then afterwards ask volunteers to improve the model by providing additional photos.</p>
<p>Moreover, besides a difference in cardinality, there is also room for differences in the two data sets in terms of subjects, as CNNs have a tendency to first decompose these into meaningful subcomponents, the recognition of which is what is being transferred. In other words, the technique is strong enough for pre-training to happen on a different type of images than fine-tuning.</p>
<p>Returning to our concrete use-case of character recognition, we will let the “public” images be those of digits <code class="language-plaintext highlighter-rouge">0-4</code> and the “private” images be those of digits <code class="language-plaintext highlighter-rouge">5-9</code>. As an alternative, it doesn’t seem unreasonable to instead have used for instance characters <code class="language-plaintext highlighter-rouge">a-z</code> as the former and digits <code class="language-plaintext highlighter-rouge">0-9</code> as the latter.</p>
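<p>One way such a split could be made is sketched below on random stand-in data. The helper <code class="language-plaintext highlighter-rouge">split_by_label</code> and the relabelling of the private digits are assumptions for illustration; the notebook may prepare the two datasets differently.</p>

```python
import numpy as np

def split_by_label(x, y, public_labels):
    # Samples whose label is in public_labels go to the public set,
    # everything else to the private set.
    is_public = np.isin(y, public_labels)
    return (x[is_public], y[is_public]), (x[~is_public], y[~is_public])

# Random stand-in for a set of 28x28 grayscale digit images.
rng = np.random.default_rng(0)
x = rng.random((100, 28, 28, 1))
y = rng.integers(0, 10, size=100)

(x_public, y_public), (x_private, y_private) = split_by_label(
    x, y, public_labels=[0, 1, 2, 3, 4])

# Shift private labels 5-9 down to 0-4 so that both phases can use the
# same number of output classes.
y_private = y_private - 5
```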
<h3 id="pre-train-on-public-dataset">Pre-train on public dataset</h3>
<p>In addition to avoiding the overhead of training on encrypted data for the public dataset, we also benefit from being able to train with more advanced optimizers. Here for instance, we switch back to the <code class="language-plaintext highlighter-rouge">Adam</code> optimizer for the public images and can take advantage of its improved training time. In particular, we see that we can again lower the number of epochs needed.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="n">x_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">),</span> <span class="p">(</span><span class="n">x_test</span><span class="p">,</span> <span class="n">y_test</span><span class="p">)</span> <span class="o">=</span> <span class="n">public_dataset</span>
<span class="n">model</span><span class="o">.</span><span class="nb">compile</span><span class="p">(</span>
<span class="n">loss</span><span class="o">=</span><span class="s">'categorical_crossentropy'</span><span class="p">,</span>
<span class="n">optimizer</span><span class="o">=</span><span class="s">'adam'</span><span class="p">,</span>
<span class="n">metrics</span><span class="o">=</span><span class="p">[</span><span class="s">'accuracy'</span><span class="p">])</span>
<span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span>
<span class="n">x_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">,</span>
<span class="n">epochs</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
<span class="n">batch_size</span><span class="o">=</span><span class="mi">32</span><span class="p">,</span>
<span class="n">verbose</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
<span class="n">validation_data</span><span class="o">=</span><span class="p">(</span><span class="n">x_test</span><span class="p">,</span> <span class="n">y_test</span><span class="p">))</span>
</code></pre></div></div>
<p>Once happy with this, the servers simply share the model parameters and move on to training on the private dataset.</p>
<h3 id="fine-tune-on-private-dataset">Fine-tune on private dataset</h3>
<p>While we now begin encrypted training on model parameters that are already “half-way there” and hence can be expected to require fewer epochs, another benefit of transfer learning, as mentioned above, is that recognition of subcomponents tends to happen in the lower layers of the network and may in some cases be used as-is. As a result, we now freeze the parameters of the feature layers and focus training efforts exclusively on the classification layers.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="n">layer</span> <span class="ow">in</span> <span class="n">feature_layers</span><span class="p">:</span>
<span class="n">layer</span><span class="o">.</span><span class="n">trainable</span> <span class="o">=</span> <span class="bp">False</span>
</code></pre></div></div>
<p>Note however that we still need to run all private training samples forward through these layers; the only difference is that we skip them in the backward step and that there are fewer parameters to train.</p>
<p>Training is then performed as before, although now using a lower learning rate.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="n">x_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">),</span> <span class="p">(</span><span class="n">x_test</span><span class="p">,</span> <span class="n">y_test</span><span class="p">)</span> <span class="o">=</span> <span class="n">private_dataset</span>
<span class="n">model</span><span class="o">.</span><span class="nb">compile</span><span class="p">(</span>
<span class="n">loss</span><span class="o">=</span><span class="s">'categorical_crossentropy'</span><span class="p">,</span>
<span class="n">optimizer</span><span class="o">=</span><span class="n">SGD</span><span class="p">(</span><span class="n">clipnorm</span><span class="o">=</span><span class="mi">10000</span><span class="p">,</span> <span class="n">clipvalue</span><span class="o">=</span><span class="mi">10000</span><span class="p">,</span> <span class="n">lr</span><span class="o">=</span><span class="mf">0.1</span><span class="p">,</span> <span class="n">momentum</span><span class="o">=</span><span class="mf">0.0</span><span class="p">),</span>
<span class="n">metrics</span><span class="o">=</span><span class="p">[</span><span class="s">'accuracy'</span><span class="p">])</span>
<span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span>
<span class="n">x_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">,</span>
<span class="n">epochs</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span>
<span class="n">batch_size</span><span class="o">=</span><span class="mi">32</span><span class="p">,</span>
<span class="n">verbose</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
<span class="n">validation_data</span><span class="o">=</span><span class="p">(</span><span class="n">x_test</span><span class="p">,</span> <span class="n">y_test</span><span class="p">))</span>
</code></pre></div></div>
<p>In the end we go from 25 epochs to 5 epochs in the simulations.</p>
<h2 id="preprocessing">Preprocessing</h2>
<p>There are a few preprocessing optimisations one could also apply but that we won’t consider further here.</p>
<p>The first is to move the computation of the frozen layers to the input provider so that it’s the output of the flatten layer that is shared with the servers instead of the pixels of the images. In this case the layers are said to perform <em>feature extraction</em> and we could potentially also use more powerful layers. However, if we want to keep the model proprietary then this adds significant complexity as the parameters now have to be distributed to the clients in some form.</p>
<p>Another typical approach to speed up training is to first apply dimensionality reduction techniques such as a <a href="https://en.wikipedia.org/wiki/Principal_component_analysis">principal component analysis</a>. This approach is taken in the encrypted setting in <a href="https://eprint.iacr.org/2017/857">BSS+’17</a>.</p>
<h1 id="adapting-the-protocol">Adapting the Protocol</h1>
<p>Having looked at the model we next turn to the protocol: as we shall see, understanding the <a href="https://github.com/wiseodd/hipsternet">operations</a> we want to perform can help speed things up.</p>
<p>In particular, a lot of the computation can be moved to the crypto provider, whose generated raw material is independent of the private inputs and to some extent even the model. As such, its computation may be done in advance whenever it’s convenient and at large scale.</p>
<p>Recall from earlier that it’s relevant to optimise both round and communication complexity, and the extensions suggested here are often aimed at improving these at the expense of additional local computation. As such, practical experiments are needed to validate their benefits under concrete conditions.</p>
<h2 id="dropout">Dropout</h2>
<p>Starting with the easiest type of layer, we notice that nothing special related to secure computation happens here, and the only thing is to make sure that the two servers agree on which values to drop in each training iteration. This can be done by simply agreeing on a seed value.</p>
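<p>A minimal sketch of this idea follows, assuming a two-server additive sharing modulo an illustrative <code class="language-plaintext highlighter-rouge">Q</code> and a seed agreed upon out of band. Both servers derive the same mask from the seed and apply it to their own shares locally. The usual rescaling of the kept values is a multiplication by a public constant and is left out here.</p>

```python
import numpy as np

Q = 2**31 - 1  # assumed modulus for this sketch

def dropout_mask(seed, shape, rate):
    # Deterministic in the seed, so both servers obtain the same mask
    # without communicating. An entry of 1 means "keep", 0 means "drop".
    rng = np.random.default_rng(seed)
    return (rng.random(shape) >= rate).astype(np.int64)

seed, shape, rate = 1234, (4, 3), 0.25
mask_server0 = dropout_mask(seed, shape, rate)
mask_server1 = dropout_mask(seed, shape, rate)

# Additively share some activations between the two servers.
rng = np.random.default_rng(0)
x = rng.integers(0, 100, size=shape)
shares0 = rng.integers(0, Q, size=shape)
shares1 = (x - shares0) % Q

# Each server masks its own share; no interaction is needed.
dropped0 = (shares0 * mask_server0) % Q
dropped1 = (shares1 * mask_server1) % Q
result = (dropped0 + dropped1) % Q
```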
<h2 id="average-pooling">Average pooling</h2>
<p>The forward pass of average pooling only requires a summation followed by a division with a public denominator. Hence, it can be implemented by a multiplication with a public value: since the denominator is public we can easily find its inverse and then simply multiply and truncate. Likewise, the backward pass is simply a scaling, and hence both directions are entirely local operations.</p>
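<p>The forward pass can be sketched as follows for a single <code class="language-plaintext highlighter-rouge">2x2</code> pooling over one channel. The fixed-point encoding with 8 fractional bits, the modulus, and all helper names are assumptions for illustration. To keep the sketch short, the final truncation is performed on the reconstructed value purely to check the arithmetic; in the protocol itself truncation is applied to the shares.</p>

```python
import numpy as np

Q = 2**31 - 1          # assumed modulus
FRACTIONAL_BITS = 8    # assumed fixed-point precision
SCALE = 2**FRACTIONAL_BITS

def encode(values):
    return np.round(np.asarray(values) * SCALE).astype(np.int64) % Q

def decode(elements):
    # Map the upper half of the field back to negative numbers.
    signed = np.where(elements > Q // 2, elements - Q, elements)
    return signed / SCALE

def sum_pool_2x2(shares):
    # Local summation over non-overlapping 2x2 windows.
    h, w = shares.shape
    return shares.reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3)) % Q

rng = np.random.default_rng(0)
x = rng.random((4, 4))
encoded = encode(x)
shares0 = rng.integers(0, Q, size=x.shape)
shares1 = (encoded - shares0) % Q

# The denominator 4 is public, so its encoded inverse is public as well.
inverse = int(encode(1 / 4))
pooled0 = (sum_pool_2x2(shares0) * inverse) % Q
pooled1 = (sum_pool_2x2(shares1) * inverse) % Q

# Checking only: reconstruct, then drop the extra scaling factor.
reconstructed = (pooled0 + pooled1) % Q
average = decode(reconstructed // SCALE)
expected = x.reshape(2, 2, 2, 2).mean(axis=(1, 3))
```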
<h2 id="dense-layers">Dense layers</h2>
<p>The dot product needed for both the forward and backward pass of dense layers can of course be implemented in the typical fashion using multiplication and addition. If we want to compute <code class="language-plaintext highlighter-rouge">dot(x, y)</code> for matrices <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> with shapes respectively <code class="language-plaintext highlighter-rouge">(m, k)</code> and <code class="language-plaintext highlighter-rouge">(k, n)</code> then this requires <code class="language-plaintext highlighter-rouge">m * n * k</code> multiplications, meaning we have to communicate the same number of masked values. While these can all be sent in parallel so we only need one round, if we allow ourselves to use another kind of preprocessed triple then we can reduce the communication cost by an order of magnitude.</p>
<p>For instance, the second dense layer in our model computes a dot product between a <code class="language-plaintext highlighter-rouge">(32, 128)</code> and a <code class="language-plaintext highlighter-rouge">(128, 5)</code> matrix. Using the typical approach requires sending <code class="language-plaintext highlighter-rouge">32 * 5 * 128 == 20480</code> masked values per batch, but by using the preprocessed triples described below we instead only have to send <code class="language-plaintext highlighter-rouge">32 * 128 + 5 * 128 == 4736</code> values, more than a factor 4 improvement. For the first dense layer it is even greater, namely slightly more than a factor 25.</p>
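<p>These counts can be reproduced with a few lines of Python, using the <code class="language-plaintext highlighter-rouge">m * n * k</code> count from the text for the typical approach. The function names are made up for this sketch, and the input width <code class="language-plaintext highlighter-rouge">14 * 14 * 32</code> of the first dense layer is an assumption derived from the model above (<code class="language-plaintext highlighter-rouge">28x28</code> inputs, <code class="language-plaintext highlighter-rouge">same</code> padding, one <code class="language-plaintext highlighter-rouge">2x2</code> pooling, 32 filters).</p>

```python
def scalar_triple_cost(m, k, n):
    # One masked value per scalar multiplication, as counted in the text.
    return m * n * k

def dot_triple_cost(m, k, n):
    # Each entry of x (m by k) and y (k by n) is masked and sent once.
    return m * k + k * n

batch_size = 32

# Second dense layer: (32, 128) times (128, 5).
second_typical = scalar_triple_cost(batch_size, 128, 5)
second_triple = dot_triple_cost(batch_size, 128, 5)

# First dense layer: assumed input width after flattening.
flattened = 14 * 14 * 32
first_typical = scalar_triple_cost(batch_size, flattened, 128)
first_triple = dot_triple_cost(batch_size, flattened, 128)

print(second_typical, second_triple, second_typical / second_triple)
print(first_typical, first_triple, first_typical / first_triple)
```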
<p>As also noted <a href="/2017/09/10/the-spdz-protocol-part2/">previously</a>, the trick is to ensure that each private value in the matrices is only sent masked once. To make this work we need triples <code class="language-plaintext highlighter-rouge">(a, b, c)</code> of random matrices <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">b</code> with the appropriate shapes and such that <code class="language-plaintext highlighter-rouge">c == dot(a, b)</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">generate_dot_triple</span><span class="p">(</span><span class="n">x_shape</span><span class="p">,</span> <span class="n">y_shape</span><span class="p">):</span>
<span class="n">a</span> <span class="o">=</span> <span class="n">sample_random_tensor</span><span class="p">(</span><span class="n">x_shape</span><span class="p">)</span>
<span class="n">b</span> <span class="o">=</span> <span class="n">sample_random_tensor</span><span class="p">(</span><span class="n">y_shape</span><span class="p">)</span>
<span class="n">c</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
<span class="k">return</span> <span class="n">PrivateTensor</span><span class="p">(</span><span class="n">a</span><span class="p">),</span> <span class="n">PrivateTensor</span><span class="p">(</span><span class="n">b</span><span class="p">),</span> <span class="n">PrivateTensor</span><span class="p">(</span><span class="n">c</span><span class="p">)</span>
</code></pre></div></div>
<p>Given such a triple we can instead communicate the values of <code class="language-plaintext highlighter-rouge">alpha = x - a</code> and <code class="language-plaintext highlighter-rouge">beta = y - b</code> followed by a local computation to obtain <code class="language-plaintext highlighter-rouge">dot(x, y)</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">PrivateTensor</span><span class="p">:</span>
<span class="o">...</span>
<span class="k">def</span> <span class="nf">dot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
<span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">y</span><span class="p">)</span> <span class="ow">is</span> <span class="n">PublicTensor</span><span class="p">:</span>
<span class="n">shares0</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">shares0</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">y</span><span class="o">.</span><span class="n">values</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
<span class="n">shares1</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">shares1</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">y</span><span class="o">.</span><span class="n">values</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
<span class="k">return</span> <span class="n">PrivateTensor</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="n">shares0</span><span class="p">,</span> <span class="n">shares1</span><span class="p">)</span>
<span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">y</span><span class="p">)</span> <span class="ow">is</span> <span class="n">PrivateTensor</span><span class="p">:</span>
<span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">a_dot_b</span> <span class="o">=</span> <span class="n">generate_dot_triple</span><span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">shape</span><span class="p">,</span> <span class="n">y</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span>
<span class="n">alpha</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span> <span class="o">-</span> <span class="n">a</span><span class="p">)</span><span class="o">.</span><span class="n">reconstruct</span><span class="p">()</span>
<span class="n">beta</span> <span class="o">=</span> <span class="p">(</span><span class="n">y</span> <span class="o">-</span> <span class="n">b</span><span class="p">)</span><span class="o">.</span><span class="n">reconstruct</span><span class="p">()</span>
<span class="k">return</span> <span class="n">alpha</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">beta</span><span class="p">)</span> <span class="o">+</span> \
<span class="n">alpha</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">b</span><span class="p">)</span> <span class="o">+</span> \
<span class="n">a</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">beta</span><span class="p">)</span> <span class="o">+</span> \
<span class="n">a_dot_b</span>
</code></pre></div></div>
<p>Security of using these triples follows the same argument as for multiplication triples: the communicated masked values perfectly hide <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code>, while <code class="language-plaintext highlighter-rouge">c</code> being an independent fresh sharing makes sure that the result cannot leak anything about its constituents.</p>
<p>Note that this kind of triple is used in <a href="https://eprint.iacr.org/2017/396">SecureML</a>, which also gives techniques allowing the servers to generate them without the help of the crypto provider.</p>
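<p>For readers who want to run the dot product end to end, below is a self-contained sketch that works directly on pairs of share arrays rather than on the <code class="language-plaintext highlighter-rouge">PrivateTensor</code> class. The small modulus is an assumption chosen so that <code class="language-plaintext highlighter-rouge">int64</code> dot products cannot overflow, and the simulation runs both servers and the crypto provider in one process.</p>

```python
import numpy as np

Q = 65537  # assumed small modulus so int64 dot products cannot overflow

def sample(shape, rng):
    return rng.integers(0, Q, size=shape)

def share(x, rng):
    s0 = sample(x.shape, rng)
    return s0, (x - s0) % Q

def reconstruct(s0, s1):
    return (s0 + s1) % Q

def generate_dot_triple(x_shape, y_shape, rng):
    # Done by the crypto provider, independently of x and y.
    a, b = sample(x_shape, rng), sample(y_shape, rng)
    c = np.dot(a, b) % Q
    return share(a, rng), share(b, rng), share(c, rng)

def private_dot(x, y, rng):
    (a0, a1), (b0, b1), (c0, c1) = generate_dot_triple(
        x[0].shape, y[0].shape, rng)
    # The only communication: open the masked matrices alpha and beta.
    alpha = reconstruct((x[0] - a0) % Q, (x[1] - a1) % Q)
    beta = reconstruct((y[0] - b0) % Q, (y[1] - b1) % Q)
    # Local computation; server 0 also adds the public term alpha.beta.
    z0 = (alpha.dot(beta) + alpha.dot(b0) + a0.dot(beta) + c0) % Q
    z1 = (alpha.dot(b1) + a1.dot(beta) + c1) % Q
    return z0, z1

rng = np.random.default_rng(0)
x = sample((32, 128), rng)
y = sample((128, 5), rng)
z0, z1 = private_dot(share(x, rng), share(y, rng), rng)
z = reconstruct(z0, z1)
```

<p>Correctness follows from expanding <code class="language-plaintext highlighter-rouge">dot(alpha + a, beta + b)</code> into the four terms summed above.</p>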
<h2 id="convolutions">Convolutions</h2>
<p>Like dense layers, convolutions can be treated either as a series of scalar multiplications or as <a href="http://cs231n.github.io/convolutional-networks/#conv">a matrix multiplication</a>, although the latter only after first expanding the tensor of training samples into a matrix with significant duplication. Unsurprisingly this leads to communication costs that in both cases can be improved by introducing another kind of triple.</p>
<p>As an example, the first convolution maps a tensor with shape <code class="language-plaintext highlighter-rouge">(m, 28, 28, 1)</code> to one with shape <code class="language-plaintext highlighter-rouge">(m, 28, 28, 32)</code> using <code class="language-plaintext highlighter-rouge">32</code> filters of shape <code class="language-plaintext highlighter-rouge">(3, 3, 1)</code> (excluding the bias vector). For batch size <code class="language-plaintext highlighter-rouge">m == 32</code> this means <code class="language-plaintext highlighter-rouge">7,225,344</code> communicated elements if we’re using only scalar multiplications, and <code class="language-plaintext highlighter-rouge">226,080</code> if using a matrix multiplication. However, since there are only <code class="language-plaintext highlighter-rouge">(32*28*28) + (32*3*3) == 25,376</code> private values involved in total (again not counting bias since they only require addition), we see that there is roughly a factor <code class="language-plaintext highlighter-rouge">9</code> overhead even in the matrix case. In other words, each private value is being masked and sent several times. With a new kind of triple we can remove this overhead and save on communication cost: for 64 bit elements this means <code class="language-plaintext highlighter-rouge">200KB</code> per batch instead of respectively <code class="language-plaintext highlighter-rouge">55MB</code> and <code class="language-plaintext highlighter-rouge">1.7MB</code>.</p>
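<p>The element counts can be checked with the short calculation below. It assumes <code class="language-plaintext highlighter-rouge">same</code> padding (so the output keeps the <code class="language-plaintext highlighter-rouge">28x28</code> size) and, for the matrix variant, an expanded input matrix with one row per output position and one column per filter weight; variable names are made up for the sketch.</p>

```python
m, height, width, channels = 32, 28, 28, 1
num_filters, fh, fw = 32, 3, 3
bytes_per_element = 8  # 64 bit elements

positions = m * height * width          # output positions per filter
weights_per_filter = fh * fw * channels

# One masked value per scalar multiplication.
scalar_elements = positions * num_filters * weights_per_filter

# Expanded (duplicated) input matrix plus the filter matrix, each sent once.
matrix_elements = positions * weights_per_filter + weights_per_filter * num_filters

# Every private input value and filter weight masked exactly once.
triple_elements = m * height * width * channels + num_filters * weights_per_filter

for name, count in [("scalar", scalar_elements),
                    ("matrix", matrix_elements),
                    ("triple", triple_elements)]:
    print(name, count, count * bytes_per_element / 2**20, "MiB")
```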
<p>The triples <code class="language-plaintext highlighter-rouge">(a, b, c)</code> we need here are similar to those used in dot products, with <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">b</code> having shapes matching the two inputs, i.e. <code class="language-plaintext highlighter-rouge">(m, 28, 28, 1)</code> and <code class="language-plaintext highlighter-rouge">(32, 3, 3, 1)</code>, and <code class="language-plaintext highlighter-rouge">c</code> matching output shape <code class="language-plaintext highlighter-rouge">(m, 28, 28, 32)</code>.</p>
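<p>To make the idea concrete, here is a minimal two-party sketch of such a convolution triple in NumPy. It is not the code from the proof-of-concept: the toy modulus <code class="language-plaintext highlighter-rouge">Q</code>, the helper names, the zero-padded <code class="language-plaintext highlighter-rouge">conv2d</code> (written for odd kernel sizes only), and the use of a non-cryptographic random generator are all assumptions made for illustration. The recombination relies only on the convolution being bilinear.</p>

```python
import numpy as np

Q = 65537  # toy prime modulus (assumption), small enough that int64 never overflows
rng = np.random.default_rng(0)  # not a cryptographic generator; illustration only


def conv2d(x, w):
    """Zero-padded 'same' convolution mod Q; x is (m, h, w, c), w is (f, kh, kw, c)."""
    m, h, wd, _ = x.shape
    f, kh, kw, _ = w.shape
    padded = np.pad(x, ((0, 0), (kh // 2, kh // 2), (kw // 2, kw // 2), (0, 0)))
    out = np.zeros((m, h, wd, f), dtype=np.int64)
    for i in range(kh):
        for j in range(kw):
            window = padded[:, i:i + h, j:j + wd, :]
            out += np.einsum('mhwc,fc->mhwf', window, w[:, i, j, :])
    return out % Q


def share(secret):
    s0 = rng.integers(0, Q, size=secret.shape)
    return s0, (secret - s0) % Q


def reconstruct(s0, s1):
    return (s0 + s1) % Q


def generate_conv_triple(x_shape, w_shape):
    # a and b match the two input shapes, c = conv2d(a, b) matches the output
    a = rng.integers(0, Q, size=x_shape)
    b = rng.integers(0, Q, size=w_shape)
    return share(a), share(b), share(conv2d(a, b))


def conv2d_private(x, w, triple):
    a, b, c = triple
    # each private value is masked and opened exactly once
    alpha = reconstruct((x[0] - a[0]) % Q, (x[1] - a[1]) % Q)
    beta = reconstruct((w[0] - b[0]) % Q, (w[1] - b[1]) % Q)
    # local recombination: conv(alpha + a, beta + b) expanded by bilinearity
    z0 = (conv2d(alpha, beta) + conv2d(alpha, b[0]) + conv2d(a[0], beta) + c[0]) % Q
    z1 = (conv2d(alpha, b[1]) + conv2d(a[1], beta) + c[1]) % Q
    return z0, z1


x = rng.integers(0, Q, size=(2, 28, 28, 1))
w = rng.integers(0, Q, size=(32, 3, 3, 1))
triple = generate_conv_triple(x.shape, w.shape)
z = conv2d_private(share(x), share(w), triple)
assert np.array_equal(reconstruct(*z), conv2d(x, w))
```

<p>Only the masked input and the masked filters are opened, so the communication is proportional to the number of private values rather than to the number of multiplications.</p>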
<h2 id="sigmoid-activations">Sigmoid activations</h2>
<p>As done <a href="/2017/04/17/private-deep-learning-with-mpc/#approximating-sigmoid">earlier</a>, we may use a degree-9 polynomial to approximate the sigmoid activation function with a sufficient level of accuracy. Evaluating this polynomial for a private value <code class="language-plaintext highlighter-rouge">x</code> requires computing a series of powers of <code class="language-plaintext highlighter-rouge">x</code>, which of course may be done by sequential multiplication, but this means several rounds and a corresponding amount of communication.</p>
<p>As an alternative we can again use a new kind of preprocessed triple that allows us to compute all required powers in a single round. As shown <a href="/2017/09/10/the-spdz-protocol-part2/">previously</a>, the length of these “triples” is not fixed but equals the highest exponent, such that a triple for e.g. squaring consists of independent sharings of <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">a**2</code>, while one for cubing consists of independent sharings of <code class="language-plaintext highlighter-rouge">a</code>, <code class="language-plaintext highlighter-rouge">a**2</code>, and <code class="language-plaintext highlighter-rouge">a**3</code>.</p>
<p>Once we have these powers of <code class="language-plaintext highlighter-rouge">x</code>, evaluating a polynomial with public coefficients is then just a local weighted sum. The security of this again follows from the fact that all powers in the triple are independently shared.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">pol_public</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">coeffs</span><span class="p">,</span> <span class="n">triple</span><span class="p">):</span>
    <span class="n">powers</span> <span class="o">=</span> <span class="n">pows</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">triple</span><span class="p">)</span>
    <span class="k">return</span> <span class="nb">sum</span><span class="p">(</span> <span class="n">xe</span> <span class="o">*</span> <span class="n">ce</span> <span class="k">for</span> <span class="n">xe</span><span class="p">,</span> <span class="n">ce</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">powers</span><span class="p">,</span> <span class="n">coeffs</span><span class="p">)</span> <span class="p">)</span>
</code></pre></div></div>
<p>We have the same caveat related to fixed-point precision as <a href="/2017/09/10/the-spdz-protocol-part2/">earlier</a> though, namely that we need more room for the higher precision of the powers: <code class="language-plaintext highlighter-rouge">x**n</code> has <code class="language-plaintext highlighter-rouge">n</code> times the precision of <code class="language-plaintext highlighter-rouge">x</code> and we want to make sure that it does not wrap around modulo <code class="language-plaintext highlighter-rouge">Q</code> since then we cannot decode correctly anymore. As done there, we can solve this by introducing a sufficiently larger field <code class="language-plaintext highlighter-rouge">P</code> to which we temporarily <a href="/2017/09/10/the-spdz-protocol-part2/">switch</a> while computing the powers, at the expense of two extra rounds of communication.</p>
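<p>The growth in precision is easy to see on plain integers. The sketch below assumes a base-10 fixed-point encoding with 6 fractional digits and a hypothetical 127-bit modulus; neither value is taken from the post, they only serve to show how quickly a ninth power outgrows the modulus.</p>

```python
BASE = 10
PRECISION = 6   # fractional digits per encoded value (assumed for illustration)
DEGREE = 9      # highest power needed by the degree-9 approximation
Q = 2**127 - 1  # hypothetical prime modulus, not the one used in the post


def encode(value, precision=PRECISION):
    return int(round(value * BASE**precision))


def decode(element, precision):
    return element / BASE**precision


x = 1.5
power = encode(x) ** DEGREE   # exact integer power, before any reduction

# The power carries DEGREE * PRECISION fractional digits and decodes correctly...
assert decode(power, DEGREE * PRECISION) == x**DEGREE
# ...but it no longer fits below Q, so reducing it modulo Q would destroy it.
assert power % Q != power


def required_bits(bound, precision=PRECISION, degree=DEGREE):
    """Bits a modulus needs to hold degree-th powers of values in [-bound, bound]."""
    largest = encode(bound, precision) ** degree
    return (2 * largest).bit_length()   # factor 2 leaves room for negative values


print(required_bits(8))
```

<p>With these assumed parameters, inputs bounded by 8 in absolute value already call for a modulus of more than 200 bits, which is the motivation for switching to a larger field only while the powers are computed.</p>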
<p>Practical experiments can show whether it is best to stay in <code class="language-plaintext highlighter-rouge">Q</code> and use a few more multiplication rounds, or perform the switch and pay for conversion and arithmetic on larger numbers. Specifically, for low degree polynomials the former is likely better.</p>
<h1 id="proof-of-concept-implementation">Proof of Concept Implementation</h1>
<p>A <a href="https://github.com/mortendahl/privateml/tree/master/image-analysis/">proof-of-concept implementation</a> without networking is available for experimentation and reproducibility. Still a work in progress, the code currently supports training a new classifier from encrypted features, but not feature extraction on encrypted images. In other words, it assumes that the input providers themselves run their images through the feature extraction layers and send the results in encrypted form to the servers; as such, the weights for that part of the model are currently not kept private. A future version will address this and allow training and predictions directly from images by enabling the feature layers to also run on encrypted data.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">pond.nn</span> <span class="kn">import</span> <span class="n">Sequential</span><span class="p">,</span> <span class="n">Dense</span><span class="p">,</span> <span class="n">Sigmoid</span><span class="p">,</span> <span class="n">Dropout</span><span class="p">,</span> <span class="n">Reveal</span><span class="p">,</span> <span class="n">Softmax</span><span class="p">,</span> <span class="n">CrossEntropy</span>
<span class="kn">from</span> <span class="nn">pond.tensor</span> <span class="kn">import</span> <span class="n">PrivateEncodedTensor</span>
<span class="n">classifier</span> <span class="o">=</span> <span class="n">Sequential</span><span class="p">([</span>
    <span class="n">Dense</span><span class="p">(</span><span class="mi">128</span><span class="p">,</span> <span class="mi">6272</span><span class="p">),</span>
    <span class="n">Sigmoid</span><span class="p">(),</span>
    <span class="n">Dropout</span><span class="p">(</span><span class="mf">.5</span><span class="p">),</span>
    <span class="n">Dense</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">128</span><span class="p">),</span>
    <span class="n">Reveal</span><span class="p">(),</span>
    <span class="n">Softmax</span><span class="p">()</span>
<span class="p">])</span>
<span class="n">classifier</span><span class="o">.</span><span class="n">initialize</span><span class="p">()</span>
<span class="n">classifier</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span>
    <span class="n">PrivateEncodedTensor</span><span class="p">(</span><span class="n">x_train_features</span><span class="p">),</span>
    <span class="n">PrivateEncodedTensor</span><span class="p">(</span><span class="n">y_train</span><span class="p">),</span>
    <span class="n">loss</span><span class="o">=</span><span class="n">CrossEntropy</span><span class="p">(),</span>
    <span class="n">epochs</span><span class="o">=</span><span class="mi">3</span>
<span class="p">)</span>
</code></pre></div></div>
<p>The code is split into several Python notebooks, and comes with a set of precomputed weights that allows for skipping some of the steps:</p>
<ul>
<li>
<p>The first one deals with <a href="https://github.com/mortendahl/privateml/tree/master/image-analysis/Pre-training.ipynb">pre-training on the public data</a> using Keras, and produces the model used for feature extraction. This step can be skipped by using the repository’s precomputed weights instead.</p>
</li>
<li>
<p>The second one applies the above model to do <a href="https://github.com/mortendahl/privateml/tree/master/image-analysis/Feature%20extraction.ipynb">feature extraction on the private data</a>, thereby producing the features used for training the new encrypted classifier. In future versions this will be done by first encrypting the data. This step cannot be skipped as the extracted data is too large.</p>
</li>
<li>
<p>The third takes the extracted features and <a href="https://github.com/mortendahl/privateml/tree/master/image-analysis/Fine-tuning.ipynb">trains a new encrypted classifier</a>. This is by far the most expensive step and may be skipped by using the repository’s precomputed weights instead.</p>
</li>
<li>
<p>Finally, the fourth notebook uses the new classifier to perform <a href="https://github.com/mortendahl/privateml/tree/master/image-analysis/Prediction.ipynb">encrypted predictions</a> from new images. Again feature extraction is currently done unencrypted.</p>
</li>
</ul>
<p>Running the code is a matter of cloning the repository</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>git clone https://github.com/mortendahl/privateml.git <span class="o">&&</span> <span class="se">\</span>
<span class="nb">cd </span>privateml/image-analysis/
</code></pre></div></div>
<p>installing the dependencies</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>pip3 <span class="nb">install </span>jupyter numpy tensorflow keras h5py
</code></pre></div></div>
<p>launching a notebook</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>jupyter notebook
</code></pre></div></div>
<p>and navigating to any of the four notebooks mentioned above.</p>
<h1 id="thoughts">Thoughts</h1>
<p>As always, when previous thoughts and questions have been answered there is already a new batch waiting.</p>
<h2 id="generalised-triples">Generalised triples</h2>
<p>When seeking to reduce communication, one may also wonder how much can be pushed to the preprocessing phase in the form of additional types of triples.</p>
<p>As mentioned several times (and also suggested in e.g. <a href="https://eprint.iacr.org/2017/1234">BCG+’17</a>), we typically seek to ensure that each private value is only sent masked once. So if we are e.g. computing both <code class="language-plaintext highlighter-rouge">dot(x, y)</code> and <code class="language-plaintext highlighter-rouge">dot(x, z)</code> then it might make sense to have a triple <code class="language-plaintext highlighter-rouge">(r, s, t, u, v)</code> where <code class="language-plaintext highlighter-rouge">r</code> is used to mask <code class="language-plaintext highlighter-rouge">x</code>, <code class="language-plaintext highlighter-rouge">s</code> to mask <code class="language-plaintext highlighter-rouge">y</code>, <code class="language-plaintext highlighter-rouge">u</code> to mask <code class="language-plaintext highlighter-rouge">z</code>, and <code class="language-plaintext highlighter-rouge">t</code> and <code class="language-plaintext highlighter-rouge">v</code> are used to compute the two results. This pattern happens during training for instance, where values computed during the forward pass are sometimes cached and reused during the backward pass.</p>
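<p>A minimal sketch of such a five-element triple for two parties could look as follows. It assumes that the two product components are <code class="language-plaintext highlighter-rouge">t = dot(r, s)</code> and <code class="language-plaintext highlighter-rouge">v = dot(r, u)</code>, which is the natural reading but not spelled out in the text; the toy modulus, the helper names, and the non-cryptographic random generator are likewise assumptions for illustration.</p>

```python
import numpy as np

Q = 65537  # toy prime modulus (assumption), small enough to avoid int64 overflow
rng = np.random.default_rng(0)  # not a cryptographic generator; illustration only


def share(secret):
    s0 = rng.integers(0, Q, size=secret.shape)
    return [s0, (secret - s0) % Q]


def reconstruct(shares):
    return (shares[0] + shares[1]) % Q


def open_masked(x, mask):
    """Each player subtracts its mask share locally, then the result is opened."""
    return reconstruct([(xi - mi) % Q for xi, mi in zip(x, mask)])


def generate_double_dot_triple(x_shape, y_shape, z_shape):
    r = rng.integers(0, Q, size=x_shape)
    s = rng.integers(0, Q, size=y_shape)
    u = rng.integers(0, Q, size=z_shape)
    t = r.dot(s) % Q   # assumed: t = dot(r, s)
    v = r.dot(u) % Q   # assumed: v = dot(r, u)
    return [share(e) for e in (r, s, t, u, v)]


def combine(alpha, beta, r, s, t):
    """Local recombination given opened alpha = x - r and beta = y - s."""
    z0 = (alpha.dot(beta) + alpha.dot(s[0]) + r[0].dot(beta) + t[0]) % Q
    z1 = (alpha.dot(s[1]) + r[1].dot(beta) + t[1]) % Q
    return [z0, z1]


def double_dot(x, y, z, triple):
    r, s, t, u, v = triple
    alpha = open_masked(x, r)   # x is masked and sent once, then reused below
    beta = open_masked(y, s)
    gamma = open_masked(z, u)
    return combine(alpha, beta, r, s, t), combine(alpha, gamma, r, u, v)


x = rng.integers(0, Q, size=(2, 3))
y = rng.integers(0, Q, size=(3, 4))
z = rng.integers(0, Q, size=(3, 5))
triple = generate_double_dot_triple(x.shape, y.shape, z.shape)
xy, xz = double_dot(share(x), share(y), share(z), triple)
assert np.array_equal(reconstruct(xy), x.dot(y) % Q)
assert np.array_equal(reconstruct(xz), x.dot(z) % Q)
```

<p>Compared to using two independent dot triples, the masked version of <code class="language-plaintext highlighter-rouge">x</code> is communicated once instead of twice.</p>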
<p>Perhaps more important, though, is the case where we are only making predictions with a model, i.e. computing with fixed private weights. In this case we only want to <a href="/2017/09/10/the-spdz-protocol-part2">mask the weights once and then reuse</a> these for each prediction. Doing so means we only have to mask and communicate proportionally to the input tensor flowing through the model, as opposed to proportionally to both the input tensor and the weights, as also done in e.g. <a href="https://arxiv.org/abs/1801.05507">JVC’18</a>. More generally, we ideally want to communicate proportionally only to the values that change, which can be achieved (in an amortised sense) using tailored triples.</p>
<p>Finally, it is in principle also possible to have <a href="/2017/09/10/the-spdz-protocol-part2">triples for more advanced functions</a> such as evaluating both a dense layer and its activation function with a single round of communication, but the big obstacle here seems to be scalability in terms of triple storage and amount of computation needed for the recombination step, especially when working with tensors.</p>
<h2 id="activation-functions">Activation functions</h2>
<p>A natural question is which of the other typical activation functions are efficient in the encrypted setting. As mentioned above, <a href="https://eprint.iacr.org/2017/396">SecureML</a> makes use of ReLU by temporarily switching to garbled circuits, and <a href="https://arxiv.org/abs/1711.05189">CryptoDL</a> gives low-degree polynomial approximations to Sigmoid, ReLU, and Tanh (using <a href="https://en.wikipedia.org/wiki/Chebyshev_polynomials">Chebyshev polynomials</a> for <a href="http://www.chebfun.org/docs/guide/guide04.html#47-the-runge-phenomenon">better accuracy</a>).</p>
<p>It may also be relevant to consider non-typical but simpler activation functions, such as squaring as in e.g. <a href="https://www.microsoft.com/en-us/research/publication/cryptonets-applying-neural-networks-to-encrypted-data-with-high-throughput-and-accuracy/">CryptoNets</a>, if for nothing else than simplifying both computation and communication.</p>
<h2 id="garbled-circuits">Garbled circuits</h2>
<p>While mentioned above only as a way of securely evaluating more advanced activation functions, <a href="https://oblivc.org/">garbled</a> <a href="https://github.com/encryptogroup/ABY">circuits</a> could in fact also be used for larger parts, including as the main means of secure computation as done in for instance <a href="https://arxiv.org/abs/1705.08963">DeepSecure</a>.</p>
<p>Compared to e.g. SPDZ this technique has the benefit of using only a constant number of communication rounds. The downside is that operations are now often happening on bits instead of on larger field elements, meaning more computation is involved.</p>
<h2 id="precision">Precision</h2>
<p>A lot of the research around <a href="https://research.googleblog.com/2017/04/federated-learning-collaborative.html">federated learning</a> involves <a href="https://arxiv.org/abs/1610.05492">gradient compression</a> in order to save on communication cost. Closer to our setting we have <a href="https://eprint.iacr.org/2017/1114">BMMP’17</a>, which uses quantization to apply homomorphic encryption to deep learning, and even <a href="https://arxiv.org/abs/1610.02132">unencrypted</a> <a href="https://www.tensorflow.org/performance/quantization">production-ready</a> systems often consider this technique as a way of improving performance also in terms of <a href="https://ai.intel.com/lowering-numerical-precision-increase-deep-learning-performance/">learning</a>.</p>
<h2 id="floating-point-arithmetic">Floating point arithmetic</h2>
<p>Above we used a fixed-point encoding of real numbers into field elements, yet unencrypted deep learning is typically using a floating point encoding. As shown in <a href="https://eprint.iacr.org/2012/405">ABZS’12</a> and <a href="https://github.com/bristolcrypto/SPDZ-2/issues/7">the reference implementation of SPDZ</a>, it is also possible to use the latter in the encrypted setting, apparently with performance advantages for certain operations.</p>
<h2 id="gpus">GPUs</h2>
<p>Since deep learning is typically done on GPUs today for performance reasons, it is natural to consider whether similar speedups can be achieved by applying them in MPC computations. Some <a href="https://www.cs.virginia.edu/~shelat/papers/hms13-gpuyao.pdf">work</a> exists on this topic for garbled circuits, yet it seems less popular in the secret sharing setting of e.g. SPDZ.</p>
<p>The biggest problem here might be the maturity and availability of arbitrary precision arithmetic on GPUs (but see e.g. <a href="http://www.comp.hkbu.edu.hk/~chxw/fgc_2010.pdf">this</a> and <a href="https://github.com/skystar0227/CUMP">that</a>) as needed for computations on field elements larger than e.g. 64 bits. Two things might be worth keeping in mind here though: firstly, while the values we compute on are larger than those natively supported, they are still bounded by the modulus; and secondly, we can do our secure computations over a ring instead of a field.</p>
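<p>As a small illustration of the second point, additive sharing and its linear operations work unchanged over a ring such as the integers modulo <code class="language-plaintext highlighter-rouge">2**64</code>, where the wrap-around of native unsigned 64 bit integers performs the reduction for free. The sketch below uses NumPy on the CPU as a stand-in; the choice of ring and all helper names are assumptions, and it deliberately says nothing about multiplication of two private values or about truncation.</p>

```python
import secrets

import numpy as np


def random_ring_elements(shape):
    """Uniform elements of Z_{2^64}, drawn from the OS randomness source."""
    count = int(np.prod(shape))
    raw = secrets.token_bytes(8 * count)
    return np.frombuffer(raw, dtype=np.uint64).reshape(shape)


def share(x):
    x0 = random_ring_elements(x.shape)
    x1 = x - x0            # uint64 arithmetic wraps around modulo 2**64
    return x0, x1


def reconstruct(x0, x1):
    return x0 + x1         # wraps around modulo 2**64 as well


def add(x, y):
    return x[0] + y[0], x[1] + y[1]


def mul_public(x, k):
    return x[0] * k, x[1] * k


x = np.array([1, 2**63, 2**64 - 1], dtype=np.uint64)
y = np.array([5, 2**63, 7], dtype=np.uint64)

z = reconstruct(*add(share(x), share(y)))
assert z.tolist() == [(int(a) + int(b)) % 2**64 for a, b in zip(x, y)]

w = reconstruct(*mul_public(share(x), np.uint64(3)))
assert w.tolist() == [(3 * int(a)) % 2**64 for a in x]
```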
<!--
One potential remedy is to decompose numbers using the [CRT](https://en.wikipedia.org/wiki/Chinese_remainder_theorem) into several components that are computed on in parallel. For this to work we would need to do our computations over a ring instead of a field, since our modulus must now be a composite number as opposed to a prime.
-->
Morten DahlTL;DR: we take a typical CNN deep learning model and go through a series of steps that enable both training and prediction to instead be done on encrypted data.The SPDZ Protocol, Part 22017-09-10T12:00:00+00:002017-09-10T12:00:00+00:00https://mortendahl.github.io/2017/09/10/the-spdz-protocol-part2<p><em><strong>This post is still very much a work in progress.</strong></em></p>
<p><em><strong>TL;DR:</strong> … </em></p>
<h1 id="triples">Triples</h1>
<h2 id="underlying-principle">Underlying principle</h2>
<p><em>(we turn our operation into a linear operation between private shares and public information; illustrate with mul; SageMath?)</em></p>
<h2 id="squaring">Squaring</h2>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">generate_square_triple</span><span class="p">():</span>
    <span class="n">a</span> <span class="o">=</span> <span class="n">random</span><span class="o">.</span><span class="n">randrange</span><span class="p">(</span><span class="n">Q</span><span class="p">)</span>
    <span class="n">aa</span> <span class="o">=</span> <span class="nb">pow</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="n">Q</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">PrivateValue</span><span class="p">(</span><span class="n">a</span><span class="p">),</span> <span class="n">PrivateValue</span><span class="p">(</span><span class="n">aa</span><span class="p">)</span>
</code></pre></div></div>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">PrivateValue</span><span class="p">:</span>
    <span class="o">...</span>
    <span class="k">def</span> <span class="nf">square</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
        <span class="n">a</span><span class="p">,</span> <span class="n">aa</span> <span class="o">=</span> <span class="n">generate_square_triple</span><span class="p">()</span>
        <span class="n">alpha</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span> <span class="o">-</span> <span class="n">a</span><span class="p">)</span><span class="o">.</span><span class="n">reconstruct</span><span class="p">()</span>
        <span class="k">return</span> <span class="n">alpha</span><span class="o">.</span><span class="n">square</span><span class="p">()</span> <span class="o">+</span> \
            <span class="p">(</span><span class="n">a</span> <span class="o">*</span> <span class="n">alpha</span><span class="p">)</span> <span class="o">*</span> <span class="mi">2</span> <span class="o">+</span> \
            <span class="n">aa</span>
</code></pre></div></div>
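<p>Since <code class="language-plaintext highlighter-rouge">PrivateValue</code> is only outlined above, here is a self-contained sketch of the same squaring step on explicit two-party additive shares. The toy modulus and the list-of-shares representation are assumptions made for illustration, and, as in the snippet above, the triple is generated inline rather than taken from a preprocessing phase.</p>

```python
import random

Q = 2**61 - 1  # toy prime modulus (assumption)


def share(secret):
    s0 = random.randrange(Q)
    return [s0, (secret - s0) % Q]


def reconstruct(shares):
    return sum(shares) % Q


def generate_square_triple():
    a = random.randrange(Q)
    return share(a), share(pow(a, 2, Q))


def square(x):
    a, aa = generate_square_triple()
    # each player masks its share locally; opening alpha costs one round
    alpha = reconstruct([(xi - ai) % Q for xi, ai in zip(x, a)])
    # x^2 = alpha^2 + 2 * alpha * a + a^2, with alpha public and a, a^2 shared
    y = [(2 * alpha * ai + aai) % Q for ai, aai in zip(a, aa)]
    # the public term alpha^2 is added by one player only
    y[0] = (y[0] + alpha * alpha) % Q
    return y


x = 12345
assert reconstruct(square(share(x))) == pow(x, 2, Q)
```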
<h2 id="dot">Dot</h2>
<h2 id="powering">Powering</h2>
<p>As an alternative we can again use a new kind of preprocessed triple that allows exponentiation to all required powers to be done in a single round. The length of these “triples” is not fixed but equals the highest exponent, such that a triple for squaring, for instance, consists of independent sharings of <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">a^2</code>, while one for cubing consists of independent sharings of <code class="language-plaintext highlighter-rouge">a</code>, <code class="language-plaintext highlighter-rouge">a^2</code>, and <code class="language-plaintext highlighter-rouge">a^3</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">generate_pows_triple</span><span class="p">(</span><span class="n">exponent</span><span class="p">,</span> <span class="n">shape</span><span class="p">):</span>
    <span class="n">a</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">randint</span><span class="p">(</span><span class="n">Q</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">shape</span><span class="p">)</span>
    <span class="k">return</span> <span class="p">[</span> <span class="n">share</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">power</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">e</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span><span class="p">)</span> <span class="k">for</span> <span class="n">e</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">exponent</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span> <span class="p">]</span>
</code></pre></div></div>
<p>To use these we notice that if <code class="language-plaintext highlighter-rouge">epsilon = x - a</code> then <code class="language-plaintext highlighter-rouge">x^n == (epsilon + a)^n</code>, which by <a href="https://en.wikipedia.org/wiki/Binomial_theorem">the binomial theorem</a> may be expressed as a weighted sum of <code class="language-plaintext highlighter-rouge">epsilon^n * a^0</code>, …, <code class="language-plaintext highlighter-rouge">epsilon^0 * a^n</code> using the <a href="https://en.wikipedia.org/wiki/Binomial_coefficient">binomial coefficients</a> as weights. For instance, we have <code class="language-plaintext highlighter-rouge">x^3 == (c0 * epsilon^3) + (c1 * epsilon^2 * a) + (c2 * epsilon * a^2) + (c3 * a^3)</code> with <code class="language-plaintext highlighter-rouge">ck = C(3, k)</code>.</p>
<p>Moreover, a triple for e.g. cubing <code class="language-plaintext highlighter-rouge">x</code> can also simultaneously be used for squaring <code class="language-plaintext highlighter-rouge">x</code> simply by skipping some powers and computing different binomial coefficients. Hence, all intermediate powers may be computed using a single triple and communication of one field element. The security of this again follows from the fact that all powers in the triple are independently shared.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">pows</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">triple</span><span class="p">):</span>
    <span class="c1"># local masking
</span>    <span class="n">a</span> <span class="o">=</span> <span class="n">triple</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
    <span class="n">v</span> <span class="o">=</span> <span class="n">sub</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">a</span><span class="p">)</span>
    <span class="c1"># communication: the players simultaneously send their share to the other
</span>    <span class="n">epsilon</span> <span class="o">=</span> <span class="n">reconstruct</span><span class="p">(</span><span class="n">v</span><span class="p">)</span>
    <span class="c1"># local combination to compute all powers
</span>    <span class="n">x_powers</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">exponent</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">triple</span><span class="p">)</span><span class="o">+</span><span class="mi">1</span><span class="p">):</span>
        <span class="c1"># prepare all term values
</span>        <span class="n">a_powers</span> <span class="o">=</span> <span class="p">[</span><span class="n">ONE</span><span class="p">]</span> <span class="o">+</span> <span class="n">triple</span><span class="p">[:</span><span class="n">exponent</span><span class="p">]</span>
        <span class="n">e_powers</span> <span class="o">=</span> <span class="p">[</span> <span class="nb">pow</span><span class="p">(</span><span class="n">epsilon</span><span class="p">,</span> <span class="n">e</span><span class="p">,</span> <span class="n">Q</span><span class="p">)</span> <span class="k">for</span> <span class="n">e</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">exponent</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span> <span class="p">]</span>
        <span class="n">coeffs</span> <span class="o">=</span> <span class="p">[</span> <span class="n">binom</span><span class="p">(</span><span class="n">exponent</span><span class="p">,</span> <span class="n">k</span><span class="p">)</span> <span class="k">for</span> <span class="n">k</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">exponent</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span> <span class="p">]</span>
        <span class="c1"># compute and sum terms
</span>        <span class="n">terms</span> <span class="o">=</span> <span class="p">(</span> <span class="n">mul_public</span><span class="p">(</span><span class="n">a</span><span class="p">,</span><span class="n">e</span><span class="o">*</span><span class="n">c</span><span class="p">)</span> <span class="k">for</span> <span class="n">a</span><span class="p">,</span><span class="n">e</span><span class="p">,</span><span class="n">c</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">a_powers</span><span class="p">,</span><span class="nb">reversed</span><span class="p">(</span><span class="n">e_powers</span><span class="p">),</span><span class="n">coeffs</span><span class="p">)</span> <span class="p">)</span>
        <span class="n">x_powers</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="nb">reduce</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">,</span><span class="n">y</span><span class="p">:</span> <span class="n">add</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">),</span> <span class="n">terms</span><span class="p">))</span>
    <span class="k">return</span> <span class="n">x_powers</span>
</code></pre></div></div>
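<p>The binomial expansion behind <code class="language-plaintext highlighter-rouge">pows</code> can be checked on plain integers, leaving out the sharing entirely. In this sketch the modulus and the concrete values are arbitrary choices, and the mask powers are kept in the clear purely to show that one set of them yields both the square and the cube.</p>

```python
from math import comb

Q = 2**61 - 1             # toy prime modulus (assumption)
x, a = 1234567, 7654321   # private input and mask, in the clear for illustration
epsilon = (x - a) % Q     # the only value that would be opened


def power_from_mask(epsilon, a_powers, n):
    """x^n as a weighted sum of epsilon^(n-k) * a^k with binomial weights."""
    return sum(
        comb(n, k) * pow(epsilon, n - k, Q) * a_powers[k]
        for k in range(n + 1)
    ) % Q


# Powers a^0, ..., a^3: the content of a cubing triple, plus the constant one.
a_powers = [pow(a, k, Q) for k in range(4)]

# The same mask powers give both the square and the cube of x.
assert power_from_mask(epsilon, a_powers, 2) == pow(x, 2, Q)
assert power_from_mask(epsilon, a_powers, 3) == pow(x, 3, Q)
```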
<h2 id="share-conversion">Share conversion</h2>
<p>There is one caveat however, and that is that we now need room for the higher precision of the powers: <code class="language-plaintext highlighter-rouge">x^n</code> has <code class="language-plaintext highlighter-rouge">n</code> times the precision of <code class="language-plaintext highlighter-rouge">x</code> and we want to make sure that this value does not wrap around modulo <code class="language-plaintext highlighter-rouge">Q</code>.</p>
<p>One way around this is to temporarily switch to a larger field and compute the powers and truncation there. The conversion to and from this larger field <code class="language-plaintext highlighter-rouge">P</code> each take one round of communication, so polynomial evaluation ends up taking a total of three rounds.</p>
<p>Security-wise we also have to pay a small price, although from a practical perspective there is little difference. In particular, for this operation we rely on <em>statistical security</em> instead of perfect security: since <code class="language-plaintext highlighter-rouge">r</code> is not a uniform random element here, there’s a tiny risk that something will be leaked about <code class="language-plaintext highlighter-rouge">x</code>.</p>
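<p>How tiny that risk is can be quantified. If <code class="language-plaintext highlighter-rouge">r</code> is uniform over a range of size <code class="language-plaintext highlighter-rouge">N</code> and <code class="language-plaintext highlighter-rouge">x</code> is non-negative, then the distributions of <code class="language-plaintext highlighter-rouge">r</code> and <code class="language-plaintext highlighter-rouge">x + r</code> (over the integers) have statistical distance <code class="language-plaintext highlighter-rouge">x / N</code>. The sketch below checks this by brute force on a small case and then evaluates it for hypothetical values of <code class="language-plaintext highlighter-rouge">BOUND</code> and <code class="language-plaintext highlighter-rouge">KAPPA</code>; these values are assumptions chosen for illustration only.</p>

```python
from collections import Counter
from fractions import Fraction


def statistical_distance(x, mask_range):
    """Distance between r and x + r for r uniform in [0, mask_range), 0 <= x."""
    # the two distributions are uniform on [0, N) and [x, x + N)
    overlap = max(mask_range - x, 0)
    return 1 - Fraction(overlap, mask_range)


def brute_force_distance(x, mask_range):
    p = Counter(range(mask_range))
    q = Counter(r + x for r in range(mask_range))
    support = set(p) | set(q)
    return Fraction(sum(abs(p[v] - q[v]) for v in support), 2 * mask_range)


# small sanity check of the closed form
assert statistical_distance(3, 10) == brute_force_distance(3, 10) == Fraction(3, 10)

BOUND = 10**3   # hypothetical bound on the absolute value of x
KAPPA = 6       # hypothetical statistical security parameter
mask_range = 2 * BOUND * 10**KAPPA

# after shifting to a positive representation, x lies in [0, 2 * BOUND)
worst_case = statistical_distance(2 * BOUND - 1, mask_range)
assert worst_case < Fraction(1, 10**KAPPA)
```

<p>In other words, with a mask range that is a factor <code class="language-plaintext highlighter-rouge">10**KAPPA</code> larger than the range of <code class="language-plaintext highlighter-rouge">x</code>, the leakage is bounded by <code class="language-plaintext highlighter-rouge">10**-KAPPA</code>.</p>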
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">generate_statistical_mask</span><span class="p">():</span>
    <span class="k">return</span> <span class="n">random</span><span class="o">.</span><span class="n">randrange</span><span class="p">(</span><span class="mi">2</span><span class="o">*</span><span class="n">BOUND</span> <span class="o">*</span> <span class="mi">10</span><span class="o">**</span><span class="n">KAPPA</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">generate_zero_triple</span><span class="p">(</span><span class="n">field</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">share</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">field</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">convert</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">from_field</span><span class="p">,</span> <span class="n">to_field</span><span class="p">,</span> <span class="n">zero_triple</span><span class="p">):</span>
    <span class="c1"># local mapping to positive representation
</span>    <span class="n">x</span> <span class="o">=</span> <span class="n">add_public</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">BOUND</span><span class="p">,</span> <span class="n">from_field</span><span class="p">)</span>
    <span class="c1"># local masking and conversion by player 0
</span>    <span class="n">r</span> <span class="o">=</span> <span class="n">generate_statistical_mask</span><span class="p">()</span>
    <span class="n">y0</span> <span class="o">=</span> <span class="p">(</span><span class="n">zero_triple</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">-</span> <span class="n">r</span><span class="p">)</span> <span class="o">%</span> <span class="n">to_field</span>
    <span class="c1"># exchange of masked share: one round of communication
</span>    <span class="n">e</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="n">r</span><span class="p">)</span> <span class="o">%</span> <span class="n">from_field</span>
    <span class="c1"># local conversion by player 1
</span>    <span class="n">xr</span> <span class="o">=</span> <span class="p">(</span><span class="n">e</span> <span class="o">+</span> <span class="n">x</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span> <span class="o">%</span> <span class="n">from_field</span>
    <span class="n">y1</span> <span class="o">=</span> <span class="p">(</span><span class="n">zero_triple</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="n">xr</span><span class="p">)</span> <span class="o">%</span> <span class="n">to_field</span>
    <span class="c1"># local mapping back from positive representation
</span>    <span class="n">y</span> <span class="o">=</span> <span class="p">[</span><span class="n">y0</span><span class="p">,</span> <span class="n">y1</span><span class="p">]</span>
    <span class="n">y</span> <span class="o">=</span> <span class="n">sub_public</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">BOUND</span><span class="p">,</span> <span class="n">to_field</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">y</span>

<span class="k">def</span> <span class="nf">upshare</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">large_zero_triple</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">convert</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">Q</span><span class="p">,</span> <span class="n">P</span><span class="p">,</span> <span class="n">large_zero_triple</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">downshare</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">small_zero_triple</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">convert</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">P</span><span class="p">,</span> <span class="n">Q</span><span class="p">,</span> <span class="n">small_zero_triple</span><span class="p">)</span>
</code></pre></div></div>
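<p>As a sanity check, here is a self-contained sketch of a round trip through the larger field and back. The concrete moduli, <code class="language-plaintext highlighter-rouge">BOUND</code> and <code class="language-plaintext highlighter-rouge">KAPPA</code> are assumptions picked for illustration, as is the choice of letting player 0 apply the public shift; the only requirement used is that the shifted and masked value stays below the modulus it is opened in.</p>

```python
import random

Q = 2**61 - 1    # assumed smaller prime modulus
P = 2**127 - 1   # assumed larger prime modulus
BOUND = 2**20    # private values are assumed to lie in [-BOUND, BOUND)
KAPPA = 6        # statistical security parameter

def share(secret, field):
    share0 = random.randrange(field)
    return [share0, (secret - share0) % field]

def reconstruct(shares, field):
    return sum(shares) % field

def to_signed(value, field):
    # upper half of the field represents negative numbers
    return value if value <= field // 2 else value - field

def convert(x, from_field, to_field, zero_sharing):
    # shift to a non-negative value; here player 0 adds the public BOUND
    x = [(x[0] + BOUND) % from_field, x[1]]
    # player 0 masks its share and prepares its new share
    r = random.randrange(2 * BOUND * 10**KAPPA)
    y0 = (zero_sharing[0] - r) % to_field
    # one round of communication: player 0 sends e to player 1
    e = (x[0] + r) % from_field
    # player 1 learns x + BOUND + r, which does not wrap around from_field
    xr = (e + x[1]) % from_field
    y1 = (zero_sharing[1] + xr) % to_field
    # undo the shift in the new field
    return [(y0 - BOUND) % to_field, y1]

x = share(-12345 % Q, Q)
x_large = convert(x, Q, P, share(0, P))
x_small = convert(x_large, P, Q, share(0, Q))
assert to_signed(reconstruct(x_large, P), P) == -12345
assert to_signed(reconstruct(x_small, Q), Q) == -12345
```

<p>With these parameters the opened value is at most roughly <code class="language-plaintext highlighter-rouge">2**41</code>, comfortably below both moduli, which is what makes the reduction modulo <code class="language-plaintext highlighter-rouge">from_field</code> harmless.</p>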
<p>Note that we could of course decide to simply do all computations in the larger field <code class="language-plaintext highlighter-rouge">P</code>, thereby avoiding the conversion steps. This will likely slow down the local computations by a non-trivial factor however, as we may need arbitrary precision arithmetic for <code class="language-plaintext highlighter-rouge">P</code> as opposed to e.g. 64-bit native arithmetic for <code class="language-plaintext highlighter-rouge">Q</code>.</p>
<p>Practical experiments will show whether it is best to stay in <code class="language-plaintext highlighter-rouge">Q</code> and use a few more rounds, or switch temporarily to <code class="language-plaintext highlighter-rouge">P</code> and pay for conversion and arbitrary precision arithmetic. Specifically, for low degree polynomials the former is likely better.</p>
<h2 id="generalised-triples">Generalised triples</h2>
<p><strong>TODO</strong> only send the masking of a value once; reuse masked version until the variable is updated</p>
<p>When seeking to reduce communication, one may also wonder how much can be pushed to the preprocessing phase in the form of additional types of triples.</p>
<p>As mentioned earlier, we might seek to ensure that each private value is only sent masked once. So if we are e.g. computing both <code class="language-plaintext highlighter-rouge">dot(X, Y)</code> and <code class="language-plaintext highlighter-rouge">dot(X, Z)</code> then it might make sense to have a triple <code class="language-plaintext highlighter-rouge">(R, S, T, U, V)</code> that allows us to compute both results yet only send <code class="language-plaintext highlighter-rouge">X</code> masked once, as done in e.g. <a href="https://eprint.iacr.org/2017/1234">BCG+’17</a>.</p>
<p>One relevant case is training, where some values are used both to compute the output of a layer during the forward phase and, typically after being cached, to update the weights during the backward phase (for instance in dense layers).</p>
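<p>A minimal scalar sketch of this idea follows. It assumes the generalised triple consists of sharings of random <code class="language-plaintext highlighter-rouge">r</code>, <code class="language-plaintext highlighter-rouge">s</code>, <code class="language-plaintext highlighter-rouge">t</code> together with <code class="language-plaintext highlighter-rouge">u = r * s</code> and <code class="language-plaintext highlighter-rouge">v = r * t</code>; that reading of <code class="language-plaintext highlighter-rouge">(R, S, T, U, V)</code>, the modulus, and all function names are assumptions made for illustration rather than the construction from the cited paper.</p>

```python
import random

Q = 2**61 - 1  # assumed prime modulus

def share(secret):
    share0 = random.randrange(Q)
    return [share0, (secret - share0) % Q]

def reconstruct(shares):
    return sum(shares) % Q

def generate_shared_mask_triple():
    # r masks x in both products; s and t mask y and z respectively
    r, s, t = (random.randrange(Q) for _ in range(3))
    return share(r), share(s), share(t), share(r * s % Q), share(r * t % Q)

def open_masked(x, mask):
    # each server subtracts locally, then the two results are exchanged
    return reconstruct([(xi - mi) % Q for xi, mi in zip(x, mask)])

def recombine(alpha, beta, a, b, ab):
    # shares of alpha*beta + alpha*b + a*beta + ab
    z = [(alpha * b[i] + a[i] * beta + ab[i]) % Q for i in range(2)]
    # the public term is added by server 0 only
    z[0] = (z[0] + alpha * beta) % Q
    return z

def mul_both(x, y, z):
    r, s, t, u, v = generate_shared_mask_triple()
    alpha = open_masked(x, r)  # x is masked and sent only once
    beta = open_masked(y, s)
    gamma = open_masked(z, t)
    return recombine(alpha, beta, r, s, u), recombine(alpha, gamma, r, t, v)

xy, xz = mul_both(share(7), share(3), share(5))
assert reconstruct(xy) == 21
assert reconstruct(xz) == 35
```

<p>Compared to using two ordinary triples, three masked values are opened instead of four; the price is a larger triple to generate and store.</p>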
<p>Another, perhaps more important case, is if we are only interested in during prediction: TODO TODO TODO</p>
<p>Additionally, it might also be possible to have triples for more advanced functions such as evaluating both a dense layer and its activation function with a single round of communication. The main question here again seems to be efficiency, this time in terms of triple storage and the amount of computation needed for the recombination step.</p>
<!--
https://www1.cs.fau.de/filepool/publications/octavian_securescm/smcint-scn10.pdf
https://www.iacr.org/archive/pkc2007/44500343/44500343.pdf
-->
<p>Morten Dahl</p>
<p>This post is still very much a work in progress.</p>
<p>The SPDZ Protocol, Part 1</p>
<p>2017-09-03T12:00:00+00:00</p>
<p>https://mortendahl.github.io/2017/09/03/the-spdz-protocol-part1</p>
<p><em><strong>This post is still very much a work in progress.</strong></em></p>
<p><em><strong>TL;DR:</strong> this is the first in a series of posts explaining a state-of-the-art protocol for secure computation.</em></p>
<p>In this series we’ll go through and describe the state-of-the-art SPDZ protocol for secure computation. Unlike the protocol used in <a href="/2017/04/17/private-deep-learning-with-mpc/">a previous blog post</a>, SPDZ allows us to have as few as two parties computing on private values, and it allows us to move parts of the computation to an <em>offline</em> phase in order to gain a more performant <em>online</em> phase. Moreover, it has received significant scientific attention over the last few years, resulting in various optimisations and efficient implementations that can be used to speed up our computation.</p>
<p>The code for this section is available in <a href="https://github.com/mortendahl/privateml/blob/master/image-analysis/Basic%20SPDZ.ipynb">this associated notebook</a>.</p>
<h1 id="background">Background</h1>
<p>The protocol was first described in <a href="https://eprint.iacr.org/2011/535">SPDZ’12</a> and <a href="https://eprint.iacr.org/2012/642">DKLPSS’13</a>, but has also been the subject of at least <a href="https://bristolcrypto.blogspot.fr/2016/10/what-is-spdz-part-1-mpc-circuit.html">one series of blog posts</a>. Several implementations exist, including <a href="https://www.cs.bris.ac.uk/Research/CryptographySecurity/SPDZ/">one</a> from the <a href="http://www.cs.bris.ac.uk/Research/CryptographySecurity/">cryptography group</a> at the University of Bristol providing both high performance and full active security.</p>
<p>As usual, all computations take place in a finite ring, often identified by a prime modulus <code class="language-plaintext highlighter-rouge">Q</code>. As we will see, this means we also need a way to encode the fixed-point numbers used by the CNNs as integers modulo a prime, and we have to take care that these never “wrap around” as we then may not be able to recover the correct result.</p>
<p>Moreover, while the computational resources used by a procedure are often measured only in time complexity, i.e. the time it takes the CPU to perform the computation, with interactive computations such as the SPDZ protocol it also becomes relevant to consider communication and round complexity. The former measures the number of bits sent across the network, which is a relatively slow process, and the latter the number of synchronisation points needed between the two parties, which may block one of them with nothing to do until the other catches up. Both hence also have a big impact on overall execution time.</p>
<p>Concretely, we have an interest in keeping <code class="language-plaintext highlighter-rouge">Q</code> as small as possible, not only because we can then do arithmetic using single word-sized operations (as opposed to arbitrary precision arithmetic, which is significantly slower), but also because we have to transmit fewer bits when sending field elements across the network.</p>
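<p>To make the encoding mentioned above concrete, here is a minimal sketch of one common way to map fixed-point numbers to integers modulo a prime. The modulus, the number of fractional bits, and the function names <code class="language-plaintext highlighter-rouge">encode</code> and <code class="language-plaintext highlighter-rouge">decode</code> are assumptions chosen for illustration.</p>

```python
Q = 2**61 - 1        # assumed prime modulus that fits in a 64-bit word
PRECISION = 16       # assumed number of fractional bits
SCALE = 2**PRECISION

def encode(rational):
    # upscale and round to an integer; negative numbers land in the upper half
    return round(rational * SCALE) % Q

def decode(element):
    # elements above Q // 2 are interpreted as negative
    signed = element if element <= Q // 2 else element - Q
    return signed / SCALE

# addition of encodings matches addition of the underlying numbers
total = (encode(1.25) + encode(-2.5)) % Q
assert decode(total) == -1.25
```

<p>The headroom matters because every multiplication of two encodings doubles the scaling factor: with these parameters the product of two encodings of <code class="language-plaintext highlighter-rouge">2**30</code> would need 92 bits and hence wraps around <code class="language-plaintext highlighter-rouge">Q</code>, after which the correct result can no longer be recovered.</p>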
<p>Note that while the protocol in general supports computations between any number of parties, we here present and specialise it for the two-party setting only. Moreover, as mentioned earlier, we aim only for passive security and assume a crypto provider that will honestly generate the needed triples.</p>
<h1 id="setting">Setting</h1>
<p>We will assume that the training data set is jointly held by a set of <em>input providers</em> and that the training is performed by two distinct <em>servers</em> (or <em>parties</em>) that are trusted not to collaborate beyond what our protocol specifies. In practice, these servers could for instance be virtual instances in a shared cloud environment operated by two different organisations.</p>
<p>The input providers are only needed in the very beginning to transmit their training data; after that all computations involve only the two servers, meaning it is indeed plausible for the input providers to use e.g. mobile phones. Once trained, the model will remain jointly held in encrypted form by the two servers where anyone can use it to make further encrypted predictions.</p>
<p>For technical reasons we also assume a distinct <em>crypto provider</em> that generates certain raw material used during the computation for increased efficiency; there are ways to eliminate this additional entity but we won’t go into that here.</p>
<p>Finally, in terms of security we aim for a typical notion used in practice, namely <em>honest-but-curious (or passive) security</em>, where the servers are assumed to follow the protocol but may otherwise try to learn as much as possible from the data they see. While a slightly weaker notion than <em>fully malicious (or active) security</em> with respect to the servers, this still gives strong protection against anyone who may compromise one of the servers <em>after</em> the computations, regardless of what they do. Note that for the purpose of this blog post we will actually allow a small privacy leakage during training as detailed later.</p>
<h1 id="secure-computation-with-spdz">Secure Computation with SPDZ</h1>
<h2 id="sharing-and-reconstruction">Sharing and reconstruction</h2>
<p>Sharing a private value between the two servers is done using the simple <a href="/2017/06/04/secret-sharing-part1/#additive-sharing">additive scheme</a>. This may be performed by anyone, including an input provider, and keeps the value <a href="https://en.wikipedia.org/wiki/Information-theoretic_security">perfectly private</a> as long as the servers are not colluding.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">share</span><span class="p">(</span><span class="n">secret</span><span class="p">):</span>
    <span class="n">share0</span> <span class="o">=</span> <span class="n">random</span><span class="o">.</span><span class="n">randrange</span><span class="p">(</span><span class="n">Q</span><span class="p">)</span>
    <span class="n">share1</span> <span class="o">=</span> <span class="p">(</span><span class="n">secret</span> <span class="o">-</span> <span class="n">share0</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
    <span class="k">return</span> <span class="p">[</span><span class="n">share0</span><span class="p">,</span> <span class="n">share1</span><span class="p">]</span>
</code></pre></div></div>
<p>And when specified by the protocol, the private value can be reconstructed by one server sending its share to the other.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">reconstruct</span><span class="p">(</span><span class="n">share0</span><span class="p">,</span> <span class="n">share1</span><span class="p">):</span>
    <span class="k">return</span> <span class="p">(</span><span class="n">share0</span> <span class="o">+</span> <span class="n">share1</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
</code></pre></div></div>
<p>Of course, if both parties are to learn the private value then they can send their share simultaneously and hence still only use one round of communication.</p>
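<p>The two functions above can be exercised as follows. The sketch repeats them with an assumed modulus so that it runs on its own, and adds a hypothetical helper <code class="language-plaintext highlighter-rouge">server1_views</code> that enumerates, for a tiny modulus, every share the second server could receive: the set is the same whatever the secret, which is the intuition behind the perfect privacy claim.</p>

```python
import random

Q = 2**61 - 1  # assumed prime modulus

def share(secret):
    share0 = random.randrange(Q)
    share1 = (secret - share0) % Q
    return [share0, share1]

def reconstruct(share0, share1):
    return (share0 + share1) % Q

def server1_views(secret, modulus):
    # every value server 1 could see, one per possible choice of share0
    return sorted((secret - share0) % modulus for share0 in range(modulus))

shares = share(42)
assert reconstruct(*shares) == 42

# with a toy modulus, the possible views are identical for different secrets
assert server1_views(2, 7) == server1_views(5, 7) == list(range(7))
```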
<p>Note that the use of an additive scheme means the servers are required to be highly robust, unlike e.g. <a href="/2017/06/04/secret-sharing-part1/">Shamir’s scheme</a> which may handle some servers dropping out. If this is a reasonable assumption though, then additive sharing provides significant advantages.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">PrivateValue</span><span class="p">:</span>

    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">value</span><span class="p">,</span> <span class="n">share0</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">share1</span><span class="o">=</span><span class="bp">None</span><span class="p">):</span>
        <span class="k">if</span> <span class="ow">not</span> <span class="n">value</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
            <span class="n">share0</span><span class="p">,</span> <span class="n">share1</span> <span class="o">=</span> <span class="n">share</span><span class="p">(</span><span class="n">value</span><span class="p">)</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">share0</span> <span class="o">=</span> <span class="n">share0</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">share1</span> <span class="o">=</span> <span class="n">share1</span>

    <span class="k">def</span> <span class="nf">reconstruct</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="n">PublicValue</span><span class="p">(</span><span class="n">reconstruct</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">share0</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">share1</span><span class="p">))</span>
</code></pre></div></div>
<h2 id="linear-operations">Linear operations</h2>
<p>Having obtained sharings of private values we may next perform certain operations on these. The first set of these is what we call linear operations since they allow us to form linear combinations of private values.</p>
<p>The first are addition and subtraction, which are simple local computations on the shares already held by each server. And if one of the values is public then we may simplify.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">PrivateValue</span><span class="p">:</span>

    <span class="o">...</span>

    <span class="k">def</span> <span class="nf">add</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
        <span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">y</span><span class="p">)</span> <span class="ow">is</span> <span class="n">PublicValue</span><span class="p">:</span>
            <span class="n">share0</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">share0</span> <span class="o">+</span> <span class="n">y</span><span class="o">.</span><span class="n">value</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
            <span class="n">share1</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">share1</span>
            <span class="k">return</span> <span class="n">PrivateValue</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="n">share0</span><span class="p">,</span> <span class="n">share1</span><span class="p">)</span>
        <span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">y</span><span class="p">)</span> <span class="ow">is</span> <span class="n">PrivateValue</span><span class="p">:</span>
            <span class="n">share0</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">share0</span> <span class="o">+</span> <span class="n">y</span><span class="o">.</span><span class="n">share0</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
            <span class="n">share1</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">share1</span> <span class="o">+</span> <span class="n">y</span><span class="o">.</span><span class="n">share1</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
            <span class="k">return</span> <span class="n">PrivateValue</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="n">share0</span><span class="p">,</span> <span class="n">share1</span><span class="p">)</span>

    <span class="k">def</span> <span class="nf">sub</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
        <span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">y</span><span class="p">)</span> <span class="ow">is</span> <span class="n">PublicValue</span><span class="p">:</span>
            <span class="n">share0</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">share0</span> <span class="o">-</span> <span class="n">y</span><span class="o">.</span><span class="n">value</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
            <span class="n">share1</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">share1</span>
            <span class="k">return</span> <span class="n">PrivateValue</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="n">share0</span><span class="p">,</span> <span class="n">share1</span><span class="p">)</span>
        <span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">y</span><span class="p">)</span> <span class="ow">is</span> <span class="n">PrivateValue</span><span class="p">:</span>
            <span class="n">share0</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">share0</span> <span class="o">-</span> <span class="n">y</span><span class="o">.</span><span class="n">share0</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
            <span class="n">share1</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">share1</span> <span class="o">-</span> <span class="n">y</span><span class="o">.</span><span class="n">share1</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
            <span class="k">return</span> <span class="n">PrivateValue</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="n">share0</span><span class="p">,</span> <span class="n">share1</span><span class="p">)</span>
</code></pre></div></div>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">x</span> <span class="o">=</span> <span class="n">PrivateValue</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">PrivateValue</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
<span class="n">z</span> <span class="o">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
<span class="k">assert</span> <span class="n">z</span><span class="o">.</span><span class="n">reconstruct</span><span class="p">()</span> <span class="o">==</span> <span class="mi">8</span>
</code></pre></div></div>
<p>Next we may also perform multiplication with a public value by again only performing a local operation on the share already held by each server.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">PrivateValue</span><span class="p">:</span>

    <span class="o">...</span>

    <span class="k">def</span> <span class="nf">mul</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
        <span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">y</span><span class="p">)</span> <span class="ow">is</span> <span class="n">PublicValue</span><span class="p">:</span>
            <span class="n">share0</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">share0</span> <span class="o">*</span> <span class="n">y</span><span class="o">.</span><span class="n">value</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
            <span class="n">share1</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">share1</span> <span class="o">*</span> <span class="n">y</span><span class="o">.</span><span class="n">value</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
            <span class="k">return</span> <span class="n">PrivateValue</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="n">share0</span><span class="p">,</span> <span class="n">share1</span><span class="p">)</span>
</code></pre></div></div>
<p>Note that the security of these operations is straightforward since no communication is taking place between the two parties and hence nothing new could have been revealed.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">x</span> <span class="o">=</span> <span class="n">PrivateValue</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">PublicValue</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
<span class="n">z</span> <span class="o">=</span> <span class="n">x</span> <span class="o">*</span> <span class="n">y</span>
<span class="k">assert</span> <span class="n">z</span><span class="o">.</span><span class="n">reconstruct</span><span class="p">()</span> <span class="o">==</span> <span class="mi">15</span>
</code></pre></div></div>
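<p>The usage snippets above write <code class="language-plaintext highlighter-rouge">x + y</code> and <code class="language-plaintext highlighter-rouge">x * y</code> and compare the reconstructed result to a plain integer, while the class excerpts define <code class="language-plaintext highlighter-rouge">add</code>, <code class="language-plaintext highlighter-rouge">sub</code> and <code class="language-plaintext highlighter-rouge">mul</code> and leave <code class="language-plaintext highlighter-rouge">PublicValue</code> unspecified. One way to fill in the gaps so that the snippets run is sketched below; the <code class="language-plaintext highlighter-rouge">PublicValue</code> class, the operator methods and the modulus are assumptions, not necessarily what the associated notebook does.</p>

```python
import random

Q = 2**61 - 1  # assumed prime modulus

class PublicValue:
    def __init__(self, value):
        self.value = value % Q

    def __eq__(self, other):
        # allow comparison against both PublicValue and plain integers
        other = other.value if isinstance(other, PublicValue) else other
        return self.value == other % Q

class PrivateValue:
    def __init__(self, value, share0=None, share1=None):
        if value is not None:
            share0 = random.randrange(Q)
            share1 = (value - share0) % Q
        self.share0 = share0
        self.share1 = share1

    def reconstruct(self):
        return PublicValue((self.share0 + self.share1) % Q)

    def __add__(self, other):
        if isinstance(other, PublicValue):
            # only one server adjusts its share by the public value
            return PrivateValue(None, (self.share0 + other.value) % Q, self.share1)
        if isinstance(other, PrivateValue):
            return PrivateValue(None, (self.share0 + other.share0) % Q,
                                (self.share1 + other.share1) % Q)
        return NotImplemented

    def __sub__(self, other):
        if isinstance(other, PublicValue):
            return PrivateValue(None, (self.share0 - other.value) % Q, self.share1)
        if isinstance(other, PrivateValue):
            return PrivateValue(None, (self.share0 - other.share0) % Q,
                                (self.share1 - other.share1) % Q)
        return NotImplemented

    def __mul__(self, other):
        if isinstance(other, PublicValue):
            # both servers scale their share by the public value
            return PrivateValue(None, (self.share0 * other.value) % Q,
                                (self.share1 * other.value) % Q)
        # private-by-private products need communication and are not covered here
        return NotImplemented

x = PrivateValue(5)
y = PrivateValue(3)
assert (x + y).reconstruct() == 8
assert (x * PublicValue(3)).reconstruct() == 15
```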
<h2 id="multiplication">Multiplication</h2>
<p>Multiplication of two private values is where we really start to deviate from the protocol used <a href="/2017/04/17/private-deep-learning-with-mpc/">previously</a>. The techniques used there inherently need at least three parties so won’t be much help in our two-party setting.</p>
<p>Perhaps more interesting though, is that the new techniques used here allow us to shift parts of the computation to an <em>offline phase</em> where <em>raw material</em> that doesn’t depend on any of the private values can be generated at convenience. As we shall see later, this can be used to significantly speed up the <em>online phase</em> where training and prediction take place.</p>
<p>This raw material is popularly called a <em>multiplication triple</em> (and sometimes <em>Beaver triple</em> due to its introduction in <a href="https://scholar.google.com/scholar?cluster=14306306930077045887">Beaver’91</a>) and consists of independent sharings of three values <code class="language-plaintext highlighter-rouge">a</code>, <code class="language-plaintext highlighter-rouge">b</code>, and <code class="language-plaintext highlighter-rouge">c</code> such that <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">b</code> are uniformly random values and <code class="language-plaintext highlighter-rouge">c == a * b % Q</code>. Here we assume that these triples are generated by the crypto provider, and the resulting shares distributed to the two parties ahead of running the online phase. In other words, when performing a multiplication we assume that <code class="language-plaintext highlighter-rouge">Pi</code> already knows <code class="language-plaintext highlighter-rouge">a[i]</code>, <code class="language-plaintext highlighter-rouge">b[i]</code>, and <code class="language-plaintext highlighter-rouge">c[i]</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">generate_mul_triple</span><span class="p">():</span>
    <span class="n">a</span> <span class="o">=</span> <span class="n">random</span><span class="o">.</span><span class="n">randrange</span><span class="p">(</span><span class="n">Q</span><span class="p">)</span>
    <span class="n">b</span> <span class="o">=</span> <span class="n">random</span><span class="o">.</span><span class="n">randrange</span><span class="p">(</span><span class="n">Q</span><span class="p">)</span>
    <span class="n">c</span> <span class="o">=</span> <span class="p">(</span><span class="n">a</span> <span class="o">*</span> <span class="n">b</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
    <span class="k">return</span> <span class="n">PrivateValue</span><span class="p">(</span><span class="n">a</span><span class="p">),</span> <span class="n">PrivateValue</span><span class="p">(</span><span class="n">b</span><span class="p">),</span> <span class="n">PrivateValue</span><span class="p">(</span><span class="n">c</span><span class="p">)</span>
</code></pre></div></div>
<p>Note that a large portion of efforts in <a href="https://eprint.iacr.org/2016/505">current</a> <a href="https://eprint.iacr.org/2017/1230">research</a> and the <a href="https://www.cs.bris.ac.uk/Research/CryptographySecurity/SPDZ/">full reference implementation</a> is spent on removing the crypto provider and instead letting the parties generate these triples on their own; we won’t go into that here but see the resources pointed to earlier for details.</p>
<p>To use multiplication triples to compute the product of two private values <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> we proceed as follows. The idea is simply to use <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">b</code> to respectively mask <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> and then reconstruct the masked values as respectively <code class="language-plaintext highlighter-rouge">alpha</code> and <code class="language-plaintext highlighter-rouge">beta</code>. As public values, <code class="language-plaintext highlighter-rouge">alpha</code> and <code class="language-plaintext highlighter-rouge">beta</code> may then be combined locally by each server to form a sharing of <code class="language-plaintext highlighter-rouge">z == x * y</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">PrivateValue</span><span class="p">:</span>

    <span class="o">...</span>

    <span class="k">def</span> <span class="nf">mul</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
        <span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">y</span><span class="p">)</span> <span class="ow">is</span> <span class="n">PublicValue</span><span class="p">:</span>
            <span class="o">...</span>
        <span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">y</span><span class="p">)</span> <span class="ow">is</span> <span class="n">PrivateValue</span><span class="p">:</span>
            <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">a_mul_b</span> <span class="o">=</span> <span class="n">generate_mul_triple</span><span class="p">()</span>
            <span class="c1"># local masking followed by communication of the reconstructed values
</span>            <span class="n">alpha</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span> <span class="o">-</span> <span class="n">a</span><span class="p">)</span><span class="o">.</span><span class="n">reconstruct</span><span class="p">()</span>
            <span class="n">beta</span> <span class="o">=</span> <span class="p">(</span><span class="n">y</span> <span class="o">-</span> <span class="n">b</span><span class="p">)</span><span class="o">.</span><span class="n">reconstruct</span><span class="p">()</span>
            <span class="c1"># local re-combination
</span>            <span class="k">return</span> <span class="n">alpha</span><span class="o">.</span><span class="n">mul</span><span class="p">(</span><span class="n">beta</span><span class="p">)</span> <span class="o">+</span> \
                   <span class="n">alpha</span><span class="o">.</span><span class="n">mul</span><span class="p">(</span><span class="n">b</span><span class="p">)</span> <span class="o">+</span> \
                   <span class="n">a</span><span class="o">.</span><span class="n">mul</span><span class="p">(</span><span class="n">beta</span><span class="p">)</span> <span class="o">+</span> \
                   <span class="n">a_mul_b</span>
</code></pre></div></div>
<p>If we write out the equations we see that <code class="language-plaintext highlighter-rouge">alpha * beta == xy - xb - ay + ab</code>, <code class="language-plaintext highlighter-rouge">a * beta == ay - ab</code>, and <code class="language-plaintext highlighter-rouge">b * alpha == bx - ab</code>, so that the sum of these with <code class="language-plaintext highlighter-rouge">c</code> cancels out everything except <code class="language-plaintext highlighter-rouge">xy</code>. In terms of complexity we see that communication of two field elements in one round is required.</p>
<p>Finally, since <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> are <a href="https://en.wikipedia.org/wiki/Information-theoretic_security">perfectly hidden</a> by <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">b</code>, neither server learns anything new as long as each triple is only used once. Moreover, the newly formed sharing of <code class="language-plaintext highlighter-rouge">z</code> is “fresh” in the sense that it contains no information about the sharings of <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> that were used in its construction, since the sharing of <code class="language-plaintext highlighter-rouge">c</code> was independent of the sharings of <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">b</code>.</p>
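<p>To make the cancellation argument concrete, the following is a minimal sketch of triple-based multiplication on two-party additive sharings. The modulus <code class="language-plaintext highlighter-rouge">PRIME</code>, the tuple representation of sharings, and the helpers <code class="language-plaintext highlighter-rouge">share</code>, <code class="language-plaintext highlighter-rouge">add_public</code>, and <code class="language-plaintext highlighter-rouge">mul_public</code> are assumptions made for illustration, and the triple is simply sampled in the clear as a stand-in for whatever trusted source generates it.</p>

```python
import random

PRIME = 2**61 - 1  # illustrative prime modulus (assumption, not from the post)

def share(secret):
    # split a secret into two additive shares, one per server
    share0 = random.randrange(PRIME)
    share1 = (secret - share0) % PRIME
    return (share0, share1)

def reconstruct(sharing):
    # the secret is the sum of the shares
    return sum(sharing) % PRIME

def add(x, y):
    # each server adds its own shares locally
    return tuple((xi + yi) % PRIME for xi, yi in zip(x, y))

def sub(x, y):
    # each server subtracts its own shares locally
    return tuple((xi - yi) % PRIME for xi, yi in zip(x, y))

def mul_public(x, k):
    # scaling a sharing by a public value is a local operation
    return tuple((xi * k) % PRIME for xi in x)

def add_public(x, k):
    # only the first server adds the public value
    return ((x[0] + k) % PRIME, x[1])

def generate_mul_triple():
    # stand-in for a trusted triple source: sharings of a, b, and a * b
    a = random.randrange(PRIME)
    b = random.randrange(PRIME)
    return share(a), share(b), share(a * b % PRIME)

def mul(x, y):
    a, b, a_mul_b = generate_mul_triple()
    # local masking followed by communication of the reconstructed values
    alpha = reconstruct(sub(x, a))
    beta = reconstruct(sub(y, b))
    # local re-combination: alpha * beta + alpha * b + a * beta + a * b
    z = add(mul_public(b, alpha), mul_public(a, beta))
    z = add(z, a_mul_b)
    return add_public(z, alpha * beta % PRIME)

x = share(5)
y = share(7)
z = mul(x, y)
print(reconstruct(z))  # 35
```

<p>Only <code class="language-plaintext highlighter-rouge">alpha</code> and <code class="language-plaintext highlighter-rouge">beta</code> are ever opened, which matches the two field elements of communication counted above.</p>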
<h1 id="next-steps">Next Steps</h1>
Morten DahlThis post is still very much a work in progress.Secret Sharing, Part 32017-08-13T12:00:00+00:002017-08-13T12:00:00+00:00https://mortendahl.github.io/2017/08/13/secret-sharing-part3<p><em><strong>TL;DR:</strong> due to redundancy in the way shares are generated, we can compensate not only for some of them being lost but also for some being manipulated; here we look at how to do this using decoding methods for Reed-Solomon codes.</em></p>
<p>Returning to our motivation in <a href="/2017/06/04/secret-sharing-part1/">part one</a> for using secret sharing, namely to distribute trust, we recall that the generated shares are given to shareholders that we may not trust individually. As such, if we later ask for the shares back in order to reconstruct the secret then it is natural to consider how reasonable it is to assume that we will receive the original shares back.</p>
<p>Specifically, what if some shares are <em>lost</em>, or what if some shares are <em>manipulated</em> to differ from the original ones? Both may happen due to simple system failure, but may also be the result of malicious behaviour on the part of shareholders. Should we in these two cases still expect to be able to recover the secret?</p>
<p>In this blog post we will see how to handle both situations. We will use simpler algorithms, but note towards the end how techniques like those used in <a href="/2017/06/24/secret-sharing-part2/">part two</a> can be used to make the process more efficient.</p>
<p>As usual, all code is available in the <a href="https://github.com/mortendahl/privateml/blob/master/secret-sharing/Reed-Solomon.ipynb">associated Python notebook</a>.</p>
<h1 id="robust-reconstruction">Robust Reconstruction</h1>
<p>In the <a href="/2017/06/04/secret-sharing-part1/#the-missing-pieces">first part</a> we saw how <a href="https://en.wikipedia.org/wiki/Lagrange_polynomial">Lagrange interpolation</a> can be used to answer the first question, in that it allows us to reconstruct the secret as long as only a bounded number of shares are lost. As mentioned in the <a href="/2017/06/24/secret-sharing-part2/#polynomials">second part</a>, this is due to the redundancy that comes with point-value representations of polynomials, namely that the original polynomial is uniquely defined by <em>any</em> large enough subset of the shares. Concretely, if <code class="language-plaintext highlighter-rouge">D</code> is the degree of the original polynomial then we can reconstruct given <code class="language-plaintext highlighter-rouge">R = D + 1</code> shares in case of Shamir’s scheme and <code class="language-plaintext highlighter-rouge">R = D + K</code> shares in the packed variant; if <code class="language-plaintext highlighter-rouge">N</code> is the total number of shares we can hence afford to lose <code class="language-plaintext highlighter-rouge">N - R</code> shares.</p>
<p>But this is assuming that the received shares are unaltered, and the second question concerning recovery in the face of manipulated shares is intuitively harder as we now cannot easily identify when and where something went wrong. <i>(Note that it is also harder in a more formal sense, namely that a solution for manipulated shares can be used as a solution for lost shares, since dummy values, e.g. a constant, may be substituted for the lost shares and then instead treated as having been manipulated. This, however, is not optimal.)</i></p>
<p>To solve this issue we will use techniques from error-correcting codes, specifically the well-known <a href="https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction">Reed-Solomon codes</a>. The reason we can do this is that share generation is very similar to (<a href="https://en.wikipedia.org/wiki/Systematic_code">non-systematic</a>) message encoding in these codes, and hence their decoding algorithms can be used to reconstruct even in the face of manipulated shares.</p>
<p>The robust reconstruction method for Shamir’s scheme we end up with is as follows, with a straightforward generalisation to the packed scheme. The input is a complete list of length <code class="language-plaintext highlighter-rouge">N</code> of received shares, where missing shares are represented by <code class="language-plaintext highlighter-rouge">None</code> and manipulated shares by their new value. If reconstruction succeeds then the output is not only the secret, but also the indices of the shares that were manipulated.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">shamir_robust_reconstruct</span><span class="p">(</span><span class="n">shares</span><span class="p">):</span>
    <span class="c1"># filter missing shares
</span>    <span class="n">points_values</span> <span class="o">=</span> <span class="p">[</span> <span class="p">(</span><span class="n">p</span><span class="p">,</span><span class="n">v</span><span class="p">)</span> <span class="k">for</span> <span class="n">p</span><span class="p">,</span><span class="n">v</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">POINTS</span><span class="p">,</span> <span class="n">shares</span><span class="p">)</span> <span class="k">if</span> <span class="n">v</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span> <span class="p">]</span>
    <span class="c1"># decode remaining faulty
</span>    <span class="n">points</span><span class="p">,</span> <span class="n">values</span> <span class="o">=</span> <span class="nb">zip</span><span class="p">(</span><span class="o">*</span><span class="n">points_values</span><span class="p">)</span>
    <span class="n">polynomial</span><span class="p">,</span> <span class="n">error_locator</span> <span class="o">=</span> <span class="n">gao_decoding</span><span class="p">(</span><span class="n">points</span><span class="p">,</span> <span class="n">values</span><span class="p">,</span> <span class="n">R</span><span class="p">,</span> <span class="n">MAX_MANIPULATED</span><span class="p">)</span>
    <span class="c1"># check if recovery was possible
</span>    <span class="k">if</span> <span class="n">polynomial</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
        <span class="c1"># there were more errors than assumed by `MAX_MANIPULATED`
</span>        <span class="k">raise</span> <span class="nb">Exception</span><span class="p">(</span><span class="s">"Too many errors, cannot reconstruct"</span><span class="p">)</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="c1"># recover secret
</span>        <span class="n">secret</span> <span class="o">=</span> <span class="n">poly_eval</span><span class="p">(</span><span class="n">polynomial</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
        <span class="c1"># find roots of error locator polynomial
</span>        <span class="n">error_indices</span> <span class="o">=</span> <span class="p">[</span> <span class="n">i</span>
            <span class="k">for</span> <span class="n">i</span><span class="p">,</span><span class="n">v</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span> <span class="n">poly_eval</span><span class="p">(</span><span class="n">error_locator</span><span class="p">,</span> <span class="n">p</span><span class="p">)</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">POINTS</span> <span class="p">)</span>
            <span class="k">if</span> <span class="n">v</span> <span class="o">==</span> <span class="mi">0</span>
        <span class="p">]</span>
        <span class="k">return</span> <span class="n">secret</span><span class="p">,</span> <span class="n">error_indices</span>
</code></pre></div></div>
<p>Having the error indices may be useful for instance as a deterrent: since we can identify malicious shareholders we may also be able to e.g. publicly shame them, and hence incentivise correct behaviour in the first place. Formally this is known as <a href="https://en.wikipedia.org/wiki/Secure_multi-party_computation#Security_definitions">covert security</a>, where shareholders are willing to cheat only if they are not caught.</p>
<p>Finally, note that reconstruction may fail, yet it can be shown that this only happens when there indeed isn’t enough information left to correctly identify the result; in other words, our method will never give a false negative. Parameters <code class="language-plaintext highlighter-rouge">MAX_MISSING</code> and <code class="language-plaintext highlighter-rouge">MAX_MANIPULATED</code> are used to characterise when failure can happen, giving respectively an upper bound on the number of lost and manipulated shares supported. What must hold in general is that the number of “redundancy shares” <code class="language-plaintext highlighter-rouge">N - R</code> satisfies <code class="language-plaintext highlighter-rouge">N - R >= MAX_MISSING + 2 * MAX_MANIPULATED</code>, from which we see that we are paying a double price for manipulated shares compared to missing shares.</p>
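<p>As a small illustration of this bound, the sketch below checks whether a given combination of missing and manipulated shares is within what the redundancy can absorb. The function name <code class="language-plaintext highlighter-rouge">within_redundancy</code> and the concrete parameter values are hypothetical and chosen only for the example.</p>

```python
def within_redundancy(n, r, missing, manipulated):
    # hypothetical helper: checks N - R >= MAX_MISSING + 2 * MAX_MANIPULATED
    # for concrete counts of missing and manipulated shares
    return n - r >= missing + 2 * manipulated

# Shamir's scheme with degree D = 3, so R = D + 1 = 4, and N = 11 shares,
# leaving N - R = 7 redundancy shares
N, R = 11, 4

print(within_redundancy(N, R, missing=1, manipulated=3))  # True:  1 + 2*3 = 7
print(within_redundancy(N, R, missing=2, manipulated=3))  # False: 2 + 2*3 = 8
print(within_redundancy(N, R, missing=7, manipulated=0))  # True:  7 + 2*0 = 7
```

<p>With no manipulated shares the bound reduces to the familiar <code class="language-plaintext highlighter-rouge">N - R</code> lost shares from plain Lagrange interpolation.</p>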
<h2 id="outline-of-decoding-algorithm">Outline of decoding algorithm</h2>
<p>The specific decoding procedure we use here works by first finding an erroneous polynomial in coefficient representation that matches all received shares, including the manipulated ones. Hence we must first find a way to interpolate not only values but also coefficients from a polynomial given in point-value representation; in other words, we must find a way to convert from point-value representation to coefficient representation. We saw in <a href="/2017/06/24/secret-sharing-part2/">part two</a> how the backward FFT can do this in specific cases, but to handle missing shares we here instead adapt <a href="https://en.wikipedia.org/wiki/Lagrange_polynomial">Lagrange interpolation</a> as used in <a href="/2017/06/04/secret-sharing-part1/">part one</a>.</p>
<p>Given the erroneous polynomial we then extract a corrected polynomial from it to get our desired result. Surprisingly, this may simply be done by running the <a href="https://en.wikipedia.org/wiki/Extended_Euclidean_algorithm#Polynomial_extended_Euclidean_algorithm">extended Euclidean algorithm</a> on polynomials as shown below.</p>
<p>Finally, since both of these two steps are using polynomials as objects of computation, similarly to how one typically uses integers as objects of computation, we must first also give algorithms for polynomial arithmetic such as adding and multiplying.</p>
<h1 id="computing-on-polynomials">Computing on Polynomials</h1>
<p>We assume we already have various functions <code class="language-plaintext highlighter-rouge">base_add</code>, <code class="language-plaintext highlighter-rouge">base_sub</code>, <code class="language-plaintext highlighter-rouge">base_mul</code>, etc. for computing in the base field; concretely this simply amounts to <a href="https://en.wikipedia.org/wiki/Modular_arithmetic">integer arithmetic modulo a fixed prime</a> in our case.</p>
<p>We then represent polynomials over this base field by their list of coefficients: <code class="language-plaintext highlighter-rouge">A(x) = (a0) + (a1 * x) + ... + (aD * x^D)</code> is represented by <code class="language-plaintext highlighter-rouge">A = [a0, a1, ..., aD]</code>. Furthermore, we keep as an invariant that <code class="language-plaintext highlighter-rouge">aD != 0</code> and enforce this below through a <code class="language-plaintext highlighter-rouge">canonical</code> procedure that removes all trailing zeros.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">canonical</span><span class="p">(</span><span class="n">A</span><span class="p">):</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">reversed</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">A</span><span class="p">))):</span>
        <span class="k">if</span> <span class="n">A</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">:</span>
            <span class="k">return</span> <span class="n">A</span><span class="p">[:</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="p">]</span>
    <span class="k">return</span> <span class="p">[]</span>
</code></pre></div></div>
<p>However, as an intermediate step we will sometimes first need to expand one of two polynomials to ensure they have the same length. This is done by simply appending zero coefficients to the shorter list.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">expand_to_match</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">):</span>
    <span class="n">diff</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">A</span><span class="p">)</span> <span class="o">-</span> <span class="nb">len</span><span class="p">(</span><span class="n">B</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">diff</span> <span class="o">></span> <span class="mi">0</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">A</span><span class="p">,</span> <span class="n">B</span> <span class="o">+</span> <span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="n">diff</span>
    <span class="k">elif</span> <span class="n">diff</span> <span class="o"><</span> <span class="mi">0</span><span class="p">:</span>
        <span class="n">diff</span> <span class="o">=</span> <span class="nb">abs</span><span class="p">(</span><span class="n">diff</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">A</span> <span class="o">+</span> <span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="n">diff</span><span class="p">,</span> <span class="n">B</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">A</span><span class="p">,</span> <span class="n">B</span>
</code></pre></div></div>
<p>With this we can perform arithmetic on polynomials by simply using the <a href="https://en.wikipedia.org/wiki/Polynomial_arithmetic">standard definitions</a>. Specifically, to add two polynomials <code class="language-plaintext highlighter-rouge">A</code> and <code class="language-plaintext highlighter-rouge">B</code> given by coefficient lists <code class="language-plaintext highlighter-rouge">[a0, ..., aM]</code> and <code class="language-plaintext highlighter-rouge">[b0, ..., bN]</code> we perform component-wise addition of the coefficients <code class="language-plaintext highlighter-rouge">ai + bi</code>. For example, adding <code class="language-plaintext highlighter-rouge">A(x) = 2x + 3x^2</code> to <code class="language-plaintext highlighter-rouge">B(x) = 1 + 4x^3</code> we get <code class="language-plaintext highlighter-rouge">A(x) + B(x) = (0+1) + (2+0)x + (3+0)x^2 + (0+4)x^3</code>; the first two are represented by <code class="language-plaintext highlighter-rouge">[0,2,3]</code> and <code class="language-plaintext highlighter-rouge">[1,0,0,4]</code> respectively, and their sum by <code class="language-plaintext highlighter-rouge">[1,2,3,4]</code>. Subtraction is similarly done component-wise.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">poly_add</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">):</span>
    <span class="n">F</span><span class="p">,</span> <span class="n">G</span> <span class="o">=</span> <span class="n">expand_to_match</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">canonical</span><span class="p">([</span> <span class="n">base_add</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">g</span><span class="p">)</span> <span class="k">for</span> <span class="n">f</span><span class="p">,</span> <span class="n">g</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">F</span><span class="p">,</span> <span class="n">G</span><span class="p">)</span> <span class="p">])</span>

<span class="k">def</span> <span class="nf">poly_sub</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">):</span>
    <span class="n">F</span><span class="p">,</span> <span class="n">G</span> <span class="o">=</span> <span class="n">expand_to_match</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">canonical</span><span class="p">([</span> <span class="n">base_sub</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">g</span><span class="p">)</span> <span class="k">for</span> <span class="n">f</span><span class="p">,</span> <span class="n">g</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">F</span><span class="p">,</span> <span class="n">G</span><span class="p">)</span> <span class="p">])</span>
</code></pre></div></div>
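<p>The worked example from the text can be run as follows. This is a self-contained sketch in which the modulus <code class="language-plaintext highlighter-rouge">PRIME = 433</code> and the definitions of <code class="language-plaintext highlighter-rouge">base_add</code> and <code class="language-plaintext highlighter-rouge">base_sub</code> are assumptions standing in for the base field operations; the polynomial functions are repeated so that the snippet runs on its own.</p>

```python
PRIME = 433  # illustrative prime for the base field (assumption)

def base_add(a, b):
    return (a + b) % PRIME

def base_sub(a, b):
    return (a - b) % PRIME

def canonical(A):
    # drop trailing zero coefficients so the leading coefficient is non-zero
    for i in reversed(range(len(A))):
        if A[i] != 0:
            return A[:i+1]
    return []

def expand_to_match(A, B):
    # pad the shorter coefficient list with zeros
    diff = len(A) - len(B)
    if diff > 0:
        return A, B + [0] * diff
    elif diff < 0:
        return A + [0] * abs(diff), B
    else:
        return A, B

def poly_add(A, B):
    F, G = expand_to_match(A, B)
    return canonical([ base_add(f, g) for f, g in zip(F, G) ])

def poly_sub(A, B):
    F, G = expand_to_match(A, B)
    return canonical([ base_sub(f, g) for f, g in zip(F, G) ])

A = [0, 2, 3]     # A(x) = 2x + 3x^2
B = [1, 0, 0, 4]  # B(x) = 1 + 4x^3

print(poly_add(A, B))                  # [1, 2, 3, 4]
print(poly_sub([1, 2, 3], [0, 0, 3]))  # [1, 2], the x^2 terms cancel
print(poly_sub(A, A))                  # [], the zero polynomial
```

<p>The last two lines show why <code class="language-plaintext highlighter-rouge">canonical</code> is applied to every result: subtraction may cancel leading coefficients, which would otherwise break the invariant that the last coefficient is non-zero.</p>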
<p>We also do scalar multiplication component-wise, i.e. by scaling every coefficient of a polynomial by an element from the base field. For instance, with <code class="language-plaintext highlighter-rouge">A(x) = 1 + 2x + 3x^2</code> we have <code class="language-plaintext highlighter-rouge">2 * A(x) = 2 + 4x + 6x^2</code>, which as expected is the same as <code class="language-plaintext highlighter-rouge">A(x) + A(x)</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">poly_scalarmul</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">b</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">canonical</span><span class="p">([</span> <span class="n">base_mul</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="k">for</span> <span class="n">a</span> <span class="ow">in</span> <span class="n">A</span> <span class="p">])</span>

<span class="k">def</span> <span class="nf">poly_scalardiv</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">b</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">canonical</span><span class="p">([</span> <span class="n">base_div</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="k">for</span> <span class="n">a</span> <span class="ow">in</span> <span class="n">A</span> <span class="p">])</span>
</code></pre></div></div>
<p>Multiplication of two polynomials is only slightly more complex, with coefficient <code class="language-plaintext highlighter-rouge">cK</code> of the product being defined by <code class="language-plaintext highlighter-rouge">cK = sum( aI * bJ for i,aI in enumerate(A) for j,bJ in enumerate(B) if i + j == K )</code>, and by changing the computation slightly we avoid iterating over <code class="language-plaintext highlighter-rouge">K</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">poly_mul</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">):</span>
    <span class="n">C</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">A</span><span class="p">)</span> <span class="o">+</span> <span class="nb">len</span><span class="p">(</span><span class="n">B</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">A</span><span class="p">)):</span>
        <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">B</span><span class="p">)):</span>
            <span class="n">C</span><span class="p">[</span><span class="n">i</span><span class="o">+</span><span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="n">base_add</span><span class="p">(</span><span class="n">C</span><span class="p">[</span><span class="n">i</span><span class="o">+</span><span class="n">j</span><span class="p">],</span> <span class="n">base_mul</span><span class="p">(</span><span class="n">A</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">B</span><span class="p">[</span><span class="n">j</span><span class="p">]))</span>
    <span class="k">return</span> <span class="n">canonical</span><span class="p">(</span><span class="n">C</span><span class="p">)</span>
</code></pre></div></div>
<p>We also need to be able to divide a polynomial <code class="language-plaintext highlighter-rouge">A</code> by another polynomial <code class="language-plaintext highlighter-rouge">B</code>, effectively finding a <em>quotient polynomial</em> <code class="language-plaintext highlighter-rouge">Q</code> and a <em>remainder polynomial</em> <code class="language-plaintext highlighter-rouge">R</code> such that <code class="language-plaintext highlighter-rouge">A == Q * B + R</code> with <code class="language-plaintext highlighter-rouge">degree(R) < degree(B)</code>. The procedure works like long division for integers and is explained in detail <a href="https://www.khanacademy.org/math/algebra2/arithmetic-with-polynomials#long-division-of-polynomials">elsewhere</a>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">poly_divmod</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">):</span>
    <span class="n">t</span> <span class="o">=</span> <span class="n">base_inverse</span><span class="p">(</span><span class="n">lc</span><span class="p">(</span><span class="n">B</span><span class="p">))</span>
    <span class="n">Q</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="nb">len</span><span class="p">(</span><span class="n">A</span><span class="p">)</span>
    <span class="n">R</span> <span class="o">=</span> <span class="n">copy</span><span class="p">(</span><span class="n">A</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">reversed</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">A</span><span class="p">)</span> <span class="o">-</span> <span class="nb">len</span><span class="p">(</span><span class="n">B</span><span class="p">)</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)):</span>
        <span class="n">Q</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">base_mul</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="n">R</span><span class="p">[</span><span class="n">i</span> <span class="o">+</span> <span class="nb">len</span><span class="p">(</span><span class="n">B</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">])</span>
        <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">B</span><span class="p">)):</span>
            <span class="n">R</span><span class="p">[</span><span class="n">i</span><span class="o">+</span><span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="n">base_sub</span><span class="p">(</span><span class="n">R</span><span class="p">[</span><span class="n">i</span><span class="o">+</span><span class="n">j</span><span class="p">],</span> <span class="n">base_mul</span><span class="p">(</span><span class="n">Q</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">B</span><span class="p">[</span><span class="n">j</span><span class="p">]))</span>
    <span class="k">return</span> <span class="n">canonical</span><span class="p">(</span><span class="n">Q</span><span class="p">),</span> <span class="n">canonical</span><span class="p">(</span><span class="n">R</span><span class="p">)</span>
</code></pre></div></div>
<p>Note that we have used basic algorithms for these operations here but that more efficient versions exist. Some pointers to these are given at the end.</p>
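<p>As a sanity check of the division procedure, the following self-contained sketch divides one polynomial by another and verifies the defining identity <code class="language-plaintext highlighter-rouge">A == Q * B + R</code>. The modulus <code class="language-plaintext highlighter-rouge">PRIME = 433</code>, the base field helpers, and the leading-coefficient helper <code class="language-plaintext highlighter-rouge">lc</code> are assumptions, since their definitions are not shown above; the quotient is named <code class="language-plaintext highlighter-rouge">quotient</code> rather than <code class="language-plaintext highlighter-rouge">Q</code> only for readability.</p>

```python
from copy import copy

PRIME = 433  # illustrative prime for the base field (assumption)

def base_add(a, b): return (a + b) % PRIME
def base_sub(a, b): return (a - b) % PRIME
def base_mul(a, b): return (a * b) % PRIME

def base_inverse(a):
    # modular inverse via Fermat's little theorem; requires a != 0 mod PRIME
    return pow(a, PRIME - 2, PRIME)

def canonical(A):
    for i in reversed(range(len(A))):
        if A[i] != 0:
            return A[:i+1]
    return []

def lc(A):
    # assumed helper: leading coefficient of a canonical, non-zero polynomial
    return A[-1]

def poly_add(A, B):
    # pad the shorter list with zeros, then add component-wise
    n = max(len(A), len(B))
    F = A + [0] * (n - len(A))
    G = B + [0] * (n - len(B))
    return canonical([ base_add(f, g) for f, g in zip(F, G) ])

def poly_mul(A, B):
    C = [0] * (len(A) + len(B) - 1)
    for i in range(len(A)):
        for j in range(len(B)):
            C[i+j] = base_add(C[i+j], base_mul(A[i], B[j]))
    return canonical(C)

def poly_divmod(A, B):
    t = base_inverse(lc(B))
    Q = [0] * len(A)
    R = copy(A)
    for i in reversed(range(0, len(A) - len(B) + 1)):
        Q[i] = base_mul(t, R[i + len(B) - 1])
        for j in range(len(B)):
            R[i+j] = base_sub(R[i+j], base_mul(Q[i], B[j]))
    return canonical(Q), canonical(R)

A = [1, 2, 3, 4]  # A(x) = 1 + 2x + 3x^2 + 4x^3
B = [5, 0, 1]     # B(x) = 5 + x^2

quotient, remainder = poly_divmod(A, B)
print(quotient)   # [3, 4], i.e. 3 + 4x
print(remainder)  # [419, 415], i.e. -14 - 18x modulo 433

# the defining identity, with the remainder of lower degree than the divisor
print(poly_add(poly_mul(quotient, B), remainder) == A)  # True
print(len(remainder) < len(B))                          # True
```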
<h1 id="interpolating-polynomials">Interpolating Polynomials</h1>
<p>We next turn to the task of converting a polynomial given in (implicit) point-value representation to its (explicit) coefficient representation. Several procedures exist for this, including efficient algorithms for specific cases such as the backward FFT seen earlier, and general ones based e.g. on <a href="https://en.wikipedia.org/wiki/Newton_polynomial">Newton’s method</a> that seem popular in numerical analysis due to their better efficiency and ability to handle new data points. However, for this post we’ll use Lagrange interpolation and see that although it’s perhaps typically seen as a procedure for interpolating the values of polynomials, it works just as well for interpolating their coefficients.</p>
<p>Recall that we are given points <code class="language-plaintext highlighter-rouge">x0, x1, ..., xD</code> and values <code class="language-plaintext highlighter-rouge">y0, y1, ..., yD</code> implicitly defining a polynomial <code class="language-plaintext highlighter-rouge">F</code>. <a href="/2017/06/04/secret-sharing-part1/">Earlier</a> we then used <a href="https://en.wikipedia.org/wiki/Lagrange_polynomial">Lagrange’s method</a> to find value <code class="language-plaintext highlighter-rouge">F(x)</code> at a potentially different point <code class="language-plaintext highlighter-rouge">x</code>. This works due to the constructive nature of Lagrange’s proof, where a polynomial <code class="language-plaintext highlighter-rouge">H</code> is defined as <code class="language-plaintext highlighter-rouge">H(X) = y0 * L0(X) + ... + yD * LD(X)</code> for indeterminate <code class="language-plaintext highlighter-rouge">X</code> and <em>Lagrange basis polynomials</em> <code class="language-plaintext highlighter-rouge">Li</code>, and then shown identical to <code class="language-plaintext highlighter-rouge">F</code>. To find <code class="language-plaintext highlighter-rouge">F(x)</code> we then simply evaluated <code class="language-plaintext highlighter-rouge">H(x)</code>, although we precomputed <code class="language-plaintext highlighter-rouge">Li(x)</code> as the <em>Lagrange constants</em> <code class="language-plaintext highlighter-rouge">ci</code> so that this step simply reduced to a weighted sum <code class="language-plaintext highlighter-rouge">y0 * c0 + ... + yD * cD</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">lagrange_constants_for_point</span><span class="p">(</span><span class="n">points</span><span class="p">,</span> <span class="n">point</span><span class="p">):</span>
    <span class="n">constants</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">xi</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">points</span><span class="p">):</span>
        <span class="n">numerator</span> <span class="o">=</span> <span class="mi">1</span>
        <span class="n">denominator</span> <span class="o">=</span> <span class="mi">1</span>
        <span class="k">for</span> <span class="n">j</span><span class="p">,</span> <span class="n">xj</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">points</span><span class="p">):</span>
            <span class="k">if</span> <span class="n">i</span> <span class="o">==</span> <span class="n">j</span><span class="p">:</span> <span class="k">continue</span>
            <span class="n">numerator</span> <span class="o">=</span> <span class="n">base_mul</span><span class="p">(</span><span class="n">numerator</span><span class="p">,</span> <span class="n">base_sub</span><span class="p">(</span><span class="n">point</span><span class="p">,</span> <span class="n">xj</span><span class="p">))</span>
            <span class="n">denominator</span> <span class="o">=</span> <span class="n">base_mul</span><span class="p">(</span><span class="n">denominator</span><span class="p">,</span> <span class="n">base_sub</span><span class="p">(</span><span class="n">xi</span><span class="p">,</span> <span class="n">xj</span><span class="p">))</span>
        <span class="n">constant</span> <span class="o">=</span> <span class="n">base_div</span><span class="p">(</span><span class="n">numerator</span><span class="p">,</span> <span class="n">denominator</span><span class="p">)</span>
        <span class="n">constants</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">constant</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">constants</span>
</code></pre></div></div>
<p>Now, when we want the coefficients of <code class="language-plaintext highlighter-rouge">F</code> instead of just its value <code class="language-plaintext highlighter-rouge">F(x)</code> at <code class="language-plaintext highlighter-rouge">x</code>, we see that while <code class="language-plaintext highlighter-rouge">H</code> is identical to <code class="language-plaintext highlighter-rouge">F</code> it only gives us a semi-explicit representation, made worse by the fact that the <code class="language-plaintext highlighter-rouge">Li</code> polynomials are also only given in a semi-explicit representation: <code class="language-plaintext highlighter-rouge">Li(X) = ((X - x0) * ... * (X - xD)) / ((xi - x0) * ... * (xi - xD))</code>, where the factors with index <code class="language-plaintext highlighter-rouge">i</code> are skipped in both products. However, since we developed algorithms for using polynomials as objects in computations, we can simply evaluate these expressions with indeterminate <code class="language-plaintext highlighter-rouge">X</code> to find the reduced explicit form! See for instance the examples <a href="https://en.wikipedia.org/wiki/Lagrange_polynomial#Examples">here</a>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">lagrange_polynomials</span><span class="p">(</span><span class="n">points</span><span class="p">):</span>
<span class="n">polys</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">xi</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">points</span><span class="p">):</span>
<span class="n">numerator</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="n">denominator</span> <span class="o">=</span> <span class="mi">1</span>
<span class="k">for</span> <span class="n">j</span><span class="p">,</span> <span class="n">xj</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">points</span><span class="p">):</span>
<span class="k">if</span> <span class="n">i</span> <span class="o">==</span> <span class="n">j</span><span class="p">:</span> <span class="k">continue</span>
<span class="n">numerator</span> <span class="o">=</span> <span class="n">poly_mul</span><span class="p">(</span><span class="n">numerator</span><span class="p">,</span> <span class="p">[</span><span class="n">base_sub</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">xj</span><span class="p">),</span> <span class="mi">1</span><span class="p">])</span>
<span class="n">denominator</span> <span class="o">=</span> <span class="n">base_mul</span><span class="p">(</span><span class="n">denominator</span><span class="p">,</span> <span class="n">base_sub</span><span class="p">(</span><span class="n">xi</span><span class="p">,</span> <span class="n">xj</span><span class="p">))</span>
<span class="n">poly</span> <span class="o">=</span> <span class="n">poly_scalardiv</span><span class="p">(</span><span class="n">numerator</span><span class="p">,</span> <span class="n">denominator</span><span class="p">)</span>
<span class="n">polys</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">poly</span><span class="p">)</span>
<span class="k">return</span> <span class="n">polys</span>
</code></pre></div></div>
<p>Doing this also for <code class="language-plaintext highlighter-rouge">H</code> gives us the interpolated polynomial in explicit coefficient representation.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">lagrange_interpolation</span><span class="p">(</span><span class="n">points</span><span class="p">,</span> <span class="n">values</span><span class="p">):</span>
<span class="n">ls</span> <span class="o">=</span> <span class="n">lagrange_polynomials</span><span class="p">(</span><span class="n">points</span><span class="p">)</span>
<span class="n">poly</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">yi</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">values</span><span class="p">):</span>
<span class="n">term</span> <span class="o">=</span> <span class="n">poly_scalarmul</span><span class="p">(</span><span class="n">ls</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">yi</span><span class="p">)</span>
<span class="n">poly</span> <span class="o">=</span> <span class="n">poly_add</span><span class="p">(</span><span class="n">poly</span><span class="p">,</span> <span class="n">term</span><span class="p">)</span>
<span class="k">return</span> <span class="n">poly</span>
</code></pre></div></div>
<p>While this may not be the most efficient way (see notes later), it is hard to beat its simplicity.</p>
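<p>As a sanity check, the interpolation above can be tried on a tiny example. The following is only a minimal sketch: the modulus <code class="language-plaintext highlighter-rouge">PRIME = 433</code> and the helper implementations are stand-ins assumed for illustration, not necessarily the ones used elsewhere in this series. It evaluates a known polynomial at three points and recovers its coefficients from the values.</p>

```python
PRIME = 433  # small prime modulus; an assumption made only for this example

def base_inverse(a):
    # multiplicative inverse via Fermat's little theorem (PRIME is prime)
    return pow(a % PRIME, PRIME - 2, PRIME)

def canonical(A):
    # drop trailing zero coefficients so that the zero polynomial is []
    A = list(A)
    while A and A[-1] == 0:
        A.pop()
    return A

def poly_add(A, B):
    n = max(len(A), len(B))
    A = A + [0] * (n - len(A))
    B = B + [0] * (n - len(B))
    return canonical([(a + b) % PRIME for a, b in zip(A, B)])

def poly_mul(A, B):
    if not A or not B:
        return []
    C = [0] * (len(A) + len(B) - 1)
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            C[i + j] = (C[i + j] + a * b) % PRIME
    return canonical(C)

def poly_scalarmul(A, b):
    return canonical([a * b % PRIME for a in A])

def poly_eval(A, x):
    # Horner's rule; coefficients are stored in increasing order of degree
    result = 0
    for a in reversed(A):
        result = (result * x + a) % PRIME
    return result

def lagrange_interpolation(points, values):
    poly = []
    for i, (xi, yi) in enumerate(zip(points, values)):
        numerator, denominator = [1], 1
        for j, xj in enumerate(points):
            if i == j:
                continue
            numerator = poly_mul(numerator, [(-xj) % PRIME, 1])  # times (X - xj)
            denominator = denominator * (xi - xj) % PRIME
        scale = yi * base_inverse(denominator) % PRIME
        poly = poly_add(poly, poly_scalarmul(numerator, scale))
    return poly

F = [1, 2, 3]                               # F(X) = 1 + 2*X + 3*X^2
points = [1, 2, 3]
values = [poly_eval(F, x) for x in points]  # the three "shares"
recovered = lagrange_interpolation(points, values)
```

<p>Since a polynomial of degree at most two is uniquely determined by three points, <code class="language-plaintext highlighter-rouge">recovered</code> ends up equal to <code class="language-plaintext highlighter-rouge">F</code>.</p>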
<h1 id="correcting-errors">Correcting Errors</h1>
<p>In the non-systematic variants of <a href="https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction">Reed-Solomon codes</a>, a message <code class="language-plaintext highlighter-rouge">m</code> represented by a vector <code class="language-plaintext highlighter-rouge">[m0, ..., mD]</code> is encoded by interpreting it as a polynomial <code class="language-plaintext highlighter-rouge">F(X) = (m0) + (m1 * X) + ... + (mD * X^D)</code> and then evaluating <code class="language-plaintext highlighter-rouge">F</code> at a fixed set of points to get the code word. Unlike share generation, no randomness is used in this process since the purpose is only to provide redundancy and not privacy (in fact, in the systematic variants, the message is directly readable from the code word), yet this doesn’t change the fact that we can use decoding procedures to correct errors in shares.</p>
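<p>To make the encoding step concrete, here is a minimal sketch of non-systematic encoding. The modulus <code class="language-plaintext highlighter-rouge">PRIME</code>, the evaluation points, and the function name <code class="language-plaintext highlighter-rouge">rs_encode</code> are assumptions chosen for illustration only.</p>

```python
PRIME = 433  # small prime modulus; an assumption made only for this example

def poly_eval(A, x):
    # Horner's rule; coefficients are stored in increasing order of degree
    result = 0
    for a in reversed(A):
        result = (result * x + a) % PRIME
    return result

def rs_encode(message, points):
    # interpret the message as polynomial coefficients and evaluate at each point
    return [poly_eval(message, x) for x in points]

message = [1, 2, 3]               # F(X) = 1 + 2*X + 3*X^2
points = [1, 2, 3, 4, 5, 6, 7]    # seven points for a three-symbol message
codeword = rs_encode(message, points)
```

<p>Encoding is deterministic: the same message always yields the same code word, whereas share generation would additionally fix some coefficients at random.</p>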
<p>Several such <a href="https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction#Error_correction_algorithms">decoding procedures</a> exist, some of which are explained <a href="https://en.wikiversity.org/wiki/Reed%E2%80%93Solomon_codes_for_coders">here</a> and <a href="https://jeremykun.com/2015/09/07/welch-berlekamp/">there</a>, yet the one we’ll use here is conceptually simple and has a certain beauty to it. Also keep in mind that some of the typical optimizations used in implementations of the alternative approaches get their speed-up by relying on properties of the more common setting over binary extension fields, while we here are interested in the setting over prime fields as we would like to simulate (bounded) integer arithmetic in our application of secret sharing to secure computation – which is straightforward in prime fields but less clear in binary extension fields.</p>
<p>The approach we will use was first described in <a href="https://doi.org/10.1016/S0019-9958(75)90090-X">SKHN’75</a>, yet we’ll follow the algorithm given in <a href="http://www.math.clemson.edu/~sgao/papers/RS.pdf">Gao’02</a> (see also Section 17.5 in <a href="http://shoup.net/ntb/ntb-v2.pdf">Shoup’08</a>). It works by first interpolating a potentially faulty polynomial <code class="language-plaintext highlighter-rouge">H</code> from all the available shares and then running the extended Euclidean algorithm to either extract the original polynomial <code class="language-plaintext highlighter-rouge">G</code> or (rightly) declare it impossible. That the algorithm can be used for this is surprising and is strongly related to <a href="https://en.wikipedia.org/wiki/Rational_reconstruction_(mathematics)">rational reconstruction</a>.</p>
<h2 id="extended-euclidean-algorithm-on-polynomials">Extended Euclidean algorithm on polynomials</h2>
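<p>Assume that we have two polynomials <code class="language-plaintext highlighter-rouge">H</code> and <code class="language-plaintext highlighter-rouge">F</code> and we would like to find linear combinations of these in the form of triples <code class="language-plaintext highlighter-rouge">(R, T, S)</code> of polynomials such that <code class="language-plaintext highlighter-rouge">R == H * T + F * S</code>. This may of course be done in many different ways, but one particularly interesting approach is to consider the list of triples <code class="language-plaintext highlighter-rouge">(R0, T0, S0), ..., (RM, TM, SM)</code> generated by the <a href="https://en.wikipedia.org/wiki/Extended_Euclidean_algorithm#Polynomial_extended_Euclidean_algorithm">extended Euclidean algorithm</a> (EEA). Note that the code stores each triple in the order <code class="language-plaintext highlighter-rouge">(R, S, T)</code>, with <code class="language-plaintext highlighter-rouge">S</code> still being the multiplier of <code class="language-plaintext highlighter-rouge">F</code> and <code class="language-plaintext highlighter-rouge">T</code> the multiplier of <code class="language-plaintext highlighter-rouge">H</code>.</p>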
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">poly_eea</span><span class="p">(</span><span class="n">F</span><span class="p">,</span> <span class="n">H</span><span class="p">):</span>
<span class="n">R0</span><span class="p">,</span> <span class="n">R1</span> <span class="o">=</span> <span class="n">F</span><span class="p">,</span> <span class="n">H</span>
<span class="n">S0</span><span class="p">,</span> <span class="n">S1</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="p">[]</span>
<span class="n">T0</span><span class="p">,</span> <span class="n">T1</span> <span class="o">=</span> <span class="p">[],</span> <span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="n">triples</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">while</span> <span class="n">R1</span> <span class="o">!=</span> <span class="p">[]:</span>
<span class="n">Q</span><span class="p">,</span> <span class="n">R2</span> <span class="o">=</span> <span class="n">poly_divmod</span><span class="p">(</span><span class="n">R0</span><span class="p">,</span> <span class="n">R1</span><span class="p">)</span>
<span class="n">triples</span><span class="o">.</span><span class="n">append</span><span class="p">(</span> <span class="p">(</span><span class="n">R0</span><span class="p">,</span> <span class="n">S0</span><span class="p">,</span> <span class="n">T0</span><span class="p">)</span> <span class="p">)</span>
<span class="n">R0</span><span class="p">,</span> <span class="n">S0</span><span class="p">,</span> <span class="n">T0</span><span class="p">,</span> <span class="n">R1</span><span class="p">,</span> <span class="n">S1</span><span class="p">,</span> <span class="n">T1</span> <span class="o">=</span> \
<span class="n">R1</span><span class="p">,</span> <span class="n">S1</span><span class="p">,</span> <span class="n">T1</span><span class="p">,</span> \
<span class="n">R2</span><span class="p">,</span> <span class="n">poly_sub</span><span class="p">(</span><span class="n">S0</span><span class="p">,</span> <span class="n">poly_mul</span><span class="p">(</span><span class="n">S1</span><span class="p">,</span> <span class="n">Q</span><span class="p">)),</span> <span class="n">poly_sub</span><span class="p">(</span><span class="n">T0</span><span class="p">,</span> <span class="n">poly_mul</span><span class="p">(</span><span class="n">T1</span><span class="p">,</span> <span class="n">Q</span><span class="p">))</span>
<span class="k">return</span> <span class="n">triples</span>
</code></pre></div></div>
<p>The reason for this is that this list turns out to represent <em>all</em> triples up to a certain size that satisfy the equation, in the sense that every “small” triple <code class="language-plaintext highlighter-rouge">(R, T, S)</code> for which <code class="language-plaintext highlighter-rouge">R == T * H + S * F</code> is actually just a scaled version of a triple <code class="language-plaintext highlighter-rouge">(Ri, Ti, Si)</code> occurring in the list generated by the EEA: for some constant <code class="language-plaintext highlighter-rouge">a</code> we have <code class="language-plaintext highlighter-rouge">R == a * Ri</code>, <code class="language-plaintext highlighter-rouge">T == a * Ti</code>, and <code class="language-plaintext highlighter-rouge">S == a * Si</code>. Moreover, given a concrete interpretation of “small” in the form of a degree bound on <code class="language-plaintext highlighter-rouge">R</code> and <code class="language-plaintext highlighter-rouge">T</code>, we may find the unique <code class="language-plaintext highlighter-rouge">(Ri, Ti, Si)</code> that this holds for.</p>
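<p>The linear relation can be checked mechanically on a small instance. The sketch below is self-contained and assumes a toy modulus <code class="language-plaintext highlighter-rouge">PRIME = 433</code> together with simple stand-in implementations of the polynomial helpers; the polynomial <code class="language-plaintext highlighter-rouge">H</code> is an arbitrary choice made for the example. It runs the EEA and verifies that every triple satisfies <code class="language-plaintext highlighter-rouge">R == S * F + T * H</code> and that the degrees of the <code class="language-plaintext highlighter-rouge">R</code> components shrink from step to step.</p>

```python
PRIME = 433  # small prime modulus; an assumption made only for this example

def canonical(A):
    # drop trailing zero coefficients so that the zero polynomial is []
    A = list(A)
    while A and A[-1] == 0:
        A.pop()
    return A

def poly_add(A, B):
    n = max(len(A), len(B))
    A = A + [0] * (n - len(A))
    B = B + [0] * (n - len(B))
    return canonical([(a + b) % PRIME for a, b in zip(A, B)])

def poly_sub(A, B):
    return poly_add(A, [(-b) % PRIME for b in B])

def poly_mul(A, B):
    if not A or not B:
        return []
    C = [0] * (len(A) + len(B) - 1)
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            C[i + j] = (C[i + j] + a * b) % PRIME
    return canonical(C)

def poly_divmod(A, B):
    # long division: A == quotient * B + remainder, for non-zero B
    remainder = list(A)
    quotient = [0] * max(len(A) - len(B) + 1, 0)
    lead_inverse = pow(B[-1], PRIME - 2, PRIME)
    for k in range(len(A) - len(B), -1, -1):
        c = remainder[k + len(B) - 1] * lead_inverse % PRIME
        quotient[k] = c
        for i, b in enumerate(B):
            remainder[k + i] = (remainder[k + i] - c * b) % PRIME
    return canonical(quotient), canonical(remainder)

def poly_eea(F, H):
    # same structure as the function above; triples are stored as (R, S, T)
    R0, R1 = F, H
    S0, S1 = [1], []
    T0, T1 = [], [1]
    triples = []
    while R1 != []:
        quotient, R2 = poly_divmod(R0, R1)
        triples.append((R0, S0, T0))
        R0, S0, T0, R1, S1, T1 = \
            R1, S1, T1, \
            R2, poly_sub(S0, poly_mul(S1, quotient)), poly_sub(T0, poly_mul(T1, quotient))
    return triples

F = [1]
for x in [1, 2, 3, 4, 5]:
    F = poly_mul(F, [(-x) % PRIME, 1])   # F = (X - 1) * ... * (X - 5)
H = [3, 1, 4, 1]                         # arbitrary polynomial of lower degree

triples = poly_eea(F, H)
invariant_holds = all(
    R == poly_add(poly_mul(S, F), poly_mul(T, H)) for R, S, T in triples
)
degrees = [len(R) - 1 for R, _, _ in triples]
```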
<p>Why this is useful in decoding becomes apparent next.</p>
<h2 id="euclidean-decoding">Euclidean decoding</h2>
<p>Say that <code class="language-plaintext highlighter-rouge">T</code> is the unknown error locator polynomial, i.e. <code class="language-plaintext highlighter-rouge">T(xi) == 0</code> exactly when share <code class="language-plaintext highlighter-rouge">yi</code> has been manipulated. Say also that <code class="language-plaintext highlighter-rouge">R = T * G</code> where <code class="language-plaintext highlighter-rouge">G</code> is the original polynomial that was used to generate the shares. Clearly, if we actually knew <code class="language-plaintext highlighter-rouge">T</code> and <code class="language-plaintext highlighter-rouge">R</code> then we could get what we’re after by a simple division <code class="language-plaintext highlighter-rouge">R / T</code> – but since we don’t, we have to do something else.</p>
<p>Because we’re only after the ratio <code class="language-plaintext highlighter-rouge">R / T</code>, we see that knowing <code class="language-plaintext highlighter-rouge">Ri</code> and <code class="language-plaintext highlighter-rouge">Ti</code> such that <code class="language-plaintext highlighter-rouge">R == a * Ri</code> and <code class="language-plaintext highlighter-rouge">T == a * Ti</code> actually gives us the same result: <code class="language-plaintext highlighter-rouge">R / T == (a * Ri) / (a * Ti) == Ri / Ti</code>, and these we could potentially get from the EEA! The only obstacles are that we need to define polynomials <code class="language-plaintext highlighter-rouge">H</code> and <code class="language-plaintext highlighter-rouge">F</code>, and we need to be sure that there is a “small” triple with the <code class="language-plaintext highlighter-rouge">R</code> and <code class="language-plaintext highlighter-rouge">T</code> as defined here that satisfies the linear equation, which in turn means making sure there exists a suitable <code class="language-plaintext highlighter-rouge">S</code>. Once done, the output of <code class="language-plaintext highlighter-rouge">poly_eea(F, H)</code> will give us the needed <code class="language-plaintext highlighter-rouge">Ri</code> and <code class="language-plaintext highlighter-rouge">Ti</code>.</p>
<p>Perhaps unsurprisingly, <code class="language-plaintext highlighter-rouge">H</code> is the polynomial interpolated using all available values, which may potentially be faulty in case some of them have been manipulated. <code class="language-plaintext highlighter-rouge">F = F1 * ... * FN</code> is the product of polynomials <code class="language-plaintext highlighter-rouge">Fi(X) = X - xi</code> where <code class="language-plaintext highlighter-rouge">X</code> is the indeterminate and <code class="language-plaintext highlighter-rouge">x1, ..., xN</code> are the points.</p>
<p>Having defined <code class="language-plaintext highlighter-rouge">H</code> and <code class="language-plaintext highlighter-rouge">F</code> like this, we can then show that our <code class="language-plaintext highlighter-rouge">R</code> and <code class="language-plaintext highlighter-rouge">T</code> as defined above are “small” when the number of errors that have occurred is below the bounds discussed earlier. Likewise it can be shown that there is an <code class="language-plaintext highlighter-rouge">S</code> such that <code class="language-plaintext highlighter-rouge">R == T * H + S * F</code>; this involves showing that <code class="language-plaintext highlighter-rouge">R - T * H == S * F</code>, which follows from <code class="language-plaintext highlighter-rouge">R == H * T mod F</code> and in turn <code class="language-plaintext highlighter-rouge">R == H * T mod Fi</code> for all <code class="language-plaintext highlighter-rouge">Fi</code>. See standard textbooks for further details.</p>
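<p>The existence of <code class="language-plaintext highlighter-rouge">S</code> can also be observed on a concrete instance. The following sketch assumes a toy modulus, seven points, a degree-two polynomial <code class="language-plaintext highlighter-rouge">G</code>, and two manipulated shares, all chosen only for illustration. Here we cheat by constructing <code class="language-plaintext highlighter-rouge">T</code> from the known error positions, which a real decoder of course cannot do, and then check that <code class="language-plaintext highlighter-rouge">R - T * H</code> is divisible by <code class="language-plaintext highlighter-rouge">F</code>.</p>

```python
PRIME = 433  # small prime modulus; an assumption made only for this example

def canonical(A):
    A = list(A)
    while A and A[-1] == 0:
        A.pop()
    return A

def poly_add(A, B):
    n = max(len(A), len(B))
    A = A + [0] * (n - len(A))
    B = B + [0] * (n - len(B))
    return canonical([(a + b) % PRIME for a, b in zip(A, B)])

def poly_sub(A, B):
    return poly_add(A, [(-b) % PRIME for b in B])

def poly_mul(A, B):
    if not A or not B:
        return []
    C = [0] * (len(A) + len(B) - 1)
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            C[i + j] = (C[i + j] + a * b) % PRIME
    return canonical(C)

def poly_divmod(A, B):
    # long division: A == quotient * B + remainder, for non-zero B
    remainder = list(A)
    quotient = [0] * max(len(A) - len(B) + 1, 0)
    lead_inverse = pow(B[-1], PRIME - 2, PRIME)
    for k in range(len(A) - len(B), -1, -1):
        c = remainder[k + len(B) - 1] * lead_inverse % PRIME
        quotient[k] = c
        for i, b in enumerate(B):
            remainder[k + i] = (remainder[k + i] - c * b) % PRIME
    return canonical(quotient), canonical(remainder)

def poly_eval(A, x):
    result = 0
    for a in reversed(A):
        result = (result * x + a) % PRIME
    return result

def lagrange_interpolation(points, values):
    poly = []
    for i, (xi, yi) in enumerate(zip(points, values)):
        numerator, denominator = [1], 1
        for j, xj in enumerate(points):
            if i == j:
                continue
            numerator = poly_mul(numerator, [(-xj) % PRIME, 1])
            denominator = denominator * (xi - xj) % PRIME
        scale = yi * pow(denominator, PRIME - 2, PRIME) % PRIME
        poly = poly_add(poly, [c * scale % PRIME for c in numerator])
    return poly

G = [1, 2, 3]                                # original polynomial
points = [1, 2, 3, 4, 5, 6, 7]
values = [poly_eval(G, x) for x in points]
values[1] = (values[1] + 5) % PRIME          # manipulate the share at x = 2
values[4] = (values[4] + 9) % PRIME          # manipulate the share at x = 5

# error locator built from the known error positions (illustration only)
T = poly_mul([(-points[1]) % PRIME, 1], [(-points[4]) % PRIME, 1])
R = poly_mul(T, G)
H = lagrange_interpolation(points, values)   # faulty interpolation
F = [1]
for x in points:
    F = poly_mul(F, [(-x) % PRIME, 1])

S, leftover = poly_divmod(poly_sub(R, poly_mul(T, H)), F)
```

<p>The remainder <code class="language-plaintext highlighter-rouge">leftover</code> is zero because <code class="language-plaintext highlighter-rouge">R - T * H = T * (G - H)</code> vanishes at every point: either the share is correct and <code class="language-plaintext highlighter-rouge">G</code> and <code class="language-plaintext highlighter-rouge">H</code> agree there, or it was manipulated and <code class="language-plaintext highlighter-rouge">T</code> is zero there.</p>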
<p>With this in place we have our decoding algorithm!</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">gao_decoding</span><span class="p">(</span><span class="n">points</span><span class="p">,</span> <span class="n">values</span><span class="p">,</span> <span class="n">max_degree</span><span class="p">,</span> <span class="n">max_error_count</span><span class="p">):</span>
<span class="c1"># interpolate faulty polynomial
</span> <span class="n">H</span> <span class="o">=</span> <span class="n">lagrange_interpolation</span><span class="p">(</span><span class="n">points</span><span class="p">,</span> <span class="n">values</span><span class="p">)</span>
<span class="c1"># compute f
</span> <span class="n">F</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="k">for</span> <span class="n">xi</span> <span class="ow">in</span> <span class="n">points</span><span class="p">:</span>
<span class="n">Fi</span> <span class="o">=</span> <span class="p">[</span><span class="n">base_sub</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">xi</span><span class="p">),</span> <span class="mi">1</span><span class="p">]</span>
<span class="n">F</span> <span class="o">=</span> <span class="n">poly_mul</span><span class="p">(</span><span class="n">F</span><span class="p">,</span> <span class="n">Fi</span><span class="p">)</span>
<span class="c1"># run EEA-like algorithm on (F,H) to find EEA triple
</span> <span class="n">R0</span><span class="p">,</span> <span class="n">R1</span> <span class="o">=</span> <span class="n">F</span><span class="p">,</span> <span class="n">H</span>
<span class="n">S0</span><span class="p">,</span> <span class="n">S1</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="p">[]</span>
<span class="n">T0</span><span class="p">,</span> <span class="n">T1</span> <span class="o">=</span> <span class="p">[],</span> <span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
<span class="n">Q</span><span class="p">,</span> <span class="n">R2</span> <span class="o">=</span> <span class="n">poly_divmod</span><span class="p">(</span><span class="n">R0</span><span class="p">,</span> <span class="n">R1</span><span class="p">)</span>
<span class="k">if</span> <span class="n">deg</span><span class="p">(</span><span class="n">R0</span><span class="p">)</span> <span class="o"><</span> <span class="n">max_degree</span> <span class="o">+</span> <span class="n">max_error_count</span><span class="p">:</span>
<span class="n">G</span><span class="p">,</span> <span class="n">leftover</span> <span class="o">=</span> <span class="n">poly_divmod</span><span class="p">(</span><span class="n">R0</span><span class="p">,</span> <span class="n">T0</span><span class="p">)</span>
<span class="k">if</span> <span class="n">leftover</span> <span class="o">==</span> <span class="p">[]:</span>
<span class="n">decoded_polynomial</span> <span class="o">=</span> <span class="n">G</span>
<span class="n">error_locator</span> <span class="o">=</span> <span class="n">T0</span>
<span class="k">return</span> <span class="n">decoded_polynomial</span><span class="p">,</span> <span class="n">error_locator</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">return</span> <span class="bp">None</span>
<span class="n">R0</span><span class="p">,</span> <span class="n">S0</span><span class="p">,</span> <span class="n">T0</span><span class="p">,</span> <span class="n">R1</span><span class="p">,</span> <span class="n">S1</span><span class="p">,</span> <span class="n">T1</span> <span class="o">=</span> \
<span class="n">R1</span><span class="p">,</span> <span class="n">S1</span><span class="p">,</span> <span class="n">T1</span><span class="p">,</span> \
<span class="n">R2</span><span class="p">,</span> <span class="n">poly_sub</span><span class="p">(</span><span class="n">S0</span><span class="p">,</span> <span class="n">poly_mul</span><span class="p">(</span><span class="n">S1</span><span class="p">,</span> <span class="n">Q</span><span class="p">)),</span> <span class="n">poly_sub</span><span class="p">(</span><span class="n">T0</span><span class="p">,</span> <span class="n">poly_mul</span><span class="p">(</span><span class="n">T1</span><span class="p">,</span> <span class="n">Q</span><span class="p">))</span>
</code></pre></div></div>
<p>Note however that it actually does more than promised above: it breaks down gracefully, by returning <code class="language-plaintext highlighter-rouge">None</code> instead of a wrong result, in case our assumption on the maximum number of errors turns out to be false. The intuition behind this is that if the assumption is true then <code class="language-plaintext highlighter-rouge">T</code> by definition is “small” and hence the properties of the EEA triple kick in to imply that the division is the same as <code class="language-plaintext highlighter-rouge">R / T</code>, which by definition of <code class="language-plaintext highlighter-rouge">R</code> has a zero remainder. And vice versa, if the remainder was zero then the returned polynomial is in fact less than the assumed number of errors away from <code class="language-plaintext highlighter-rouge">H</code> and hence <code class="language-plaintext highlighter-rouge">T</code> by definition is “small”. In other words, <code class="language-plaintext highlighter-rouge">None</code> is returned if and only if our assumption was false, which is pretty neat. See <a href="http://www.math.clemson.edu/~sgao/papers/RS.pdf">Gao’02</a> for further details.</p>
<p>Finally, note that it also gives us the error locations in the form of the roots of <code class="language-plaintext highlighter-rouge">T</code>. As mentioned earlier this is very useful from an application point of view, but could also have been obtained by simply comparing the received shares against a re-sharing based on the decoded polynomial.</p>
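<p>Both ways of locating errors can be sketched in a few lines. The modulus, the points, the decoded polynomial, and the error locator below are assumed values for a toy instance with manipulated shares at <code class="language-plaintext highlighter-rouge">x = 2</code> and <code class="language-plaintext highlighter-rouge">x = 5</code>; the function names are hypothetical.</p>

```python
PRIME = 433  # small prime modulus; an assumption made only for this example

def poly_eval(A, x):
    # Horner's rule; coefficients are stored in increasing order of degree
    result = 0
    for a in reversed(A):
        result = (result * x + a) % PRIME
    return result

def errors_from_locator(points, error_locator):
    # indices of the points that are roots of the error locator polynomial
    return [i for i, x in enumerate(points) if poly_eval(error_locator, x) == 0]

def errors_from_resharing(points, received, decoded_polynomial):
    # indices where a received share differs from a fresh evaluation
    return [
        i for i, (x, y) in enumerate(zip(points, received))
        if poly_eval(decoded_polynomial, x) != y
    ]

points = [1, 2, 3, 4, 5, 6, 7]
decoded_polynomial = [1, 2, 3]              # G(X) = 1 + 2*X + 3*X^2
received = [6, 22, 34, 57, 95, 121, 162]    # shares at x = 2 and x = 5 are off
error_locator = [10, 426, 1]                # (X - 2) * (X - 5) mod 433

by_roots = errors_from_locator(points, error_locator)
by_resharing = errors_from_resharing(points, received, decoded_polynomial)
```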
<h1 id="efficiency-improvements">Efficiency Improvements</h1>
<p>The algorithms presented above have time complexity <code class="language-plaintext highlighter-rouge">O(N^2)</code> and are hence not the most efficient. Based on the <a href="/2017/06/24/secret-sharing-part2/">second part</a> we may straight away see how interpolation can be sped up by using the <a href="https://en.wikipedia.org/wiki/Fast_Fourier_transform">Fast Fourier Transform</a> instead of Lagrange’s method. One downside is that we then need to assume that <code class="language-plaintext highlighter-rouge">x1, ..., xN</code> are Fourier points, i.e. with a special structure, and we need to fill in dummy values for the missing shares and hence pay double the price. <a href="https://en.wikipedia.org/wiki/Newton_polynomial">Newton’s method</a> alternatively avoids this constraint while potentially giving better concrete performance than Lagrange’s.</p>
<p>However, there are also other fast interpolation algorithms without these constraints, as detailed for instance in Modern Computer Algebra or <a href="http://cr.yp.to/f2mult/mateer-thesis.pdf">this thesis</a>, which also reduce the asymptotic complexity to <code class="language-plaintext highlighter-rouge">O(N * log N)</code>. The former reference also contains fast <code class="language-plaintext highlighter-rouge">O(N * log N)</code> methods for arithmetic and the EEA.</p>
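<p>For comparison, Newton’s method could look roughly as follows. This is a minimal sketch using divided differences over an assumed toy prime field; the function names are hypothetical and it is not taken from any of the implementations linked in this series. The interpolant is kept in Newton form and evaluated with a Horner-like scheme rather than expanded into explicit coefficients.</p>

```python
PRIME = 433  # small prime modulus; an assumption made only for this example

def base_inverse(a):
    # multiplicative inverse via Fermat's little theorem (PRIME is prime)
    return pow(a % PRIME, PRIME - 2, PRIME)

def newton_coefficients(points, values):
    # in-place divided differences: coeffs[i] ends up as f[x0, ..., xi]
    coeffs = [v % PRIME for v in values]
    n = len(points)
    for k in range(1, n):
        for i in range(n - 1, k - 1, -1):
            difference = (coeffs[i] - coeffs[i - 1]) % PRIME
            coeffs[i] = difference * base_inverse(points[i] - points[i - k]) % PRIME
    return coeffs

def newton_eval(points, coeffs, x):
    # evaluates c0 + c1*(x - x0) + c2*(x - x0)*(x - x1) + ... from the inside out
    result = 0
    for c, xi in zip(reversed(coeffs), reversed(points)):
        result = (result * (x - xi) + c) % PRIME
    return result

points = [1, 2, 3]
values = [6, 17, 34]    # values of 1 + 2*X + 3*X^2 at the points
coeffs = newton_coefficients(points, values)
value_at_4 = newton_eval(points, coeffs, 4)
```

<p>One practical advantage of this form is that adding a further point only appends one coefficient instead of requiring all basis polynomials to be recomputed.</p>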
<h1 id="next-steps">Next Steps</h1>
<p>The first three posts have been a lot of theory and it’s now time to turn to applications.</p>Morten DahlTL;DR: due to redundancy in the way shares are generated, we can compensate not only for some of them being lost but also for some being manipulated; here we look at how to do this using decoding methods for Reed-Solomon codes.Recent Talks on Privacy2017-08-12T12:00:00+00:002017-08-12T12:00:00+00:00https://mortendahl.github.io/2017/08/12/recent-talks-on-privacy<p>During winter and spring I was fortunate enough to have a few occasions to talk about some of the work done at <a href="https://snips.ai">Snips</a> on applying <a href="https://en.wikipedia.org/wiki/Privacy-enhancing_technologies">privacy-enhancing technologies</a> in a start-up building privacy-aware machine learning systems for mobile devices.</p>
<p>These were mainly centered around the <a href="https://github.com/snipsco/sda"><em>Secure Distributed Aggregator</em></a> (SDA) for learning from user data distributed on mobile devices in a privacy-preserving manner, i.e. without learning any individual data, only the final aggregate, but there was also room for discussion around privacy from a broader perspective, including how it has played into decisions made by the company.</p>
<h1 id="what-privacy-has-meant-for-snips">What Privacy Has Meant For Snips</h1>
<p>Given at the workshop on <a href="http://wwwf.imperial.ac.uk/~nadams/events/ic-rss2017/ic-rss2017.html"><em>Privacy in Statistical Analysis (PSA’17)</em></a>, this invited <a href="https://github.com/mortendahl/privateml/raw/master/talks/PSA17-slides.pdf">talk</a> aimed at giving an industrial perspective on privacy, including how it has played a role at Snips from its beginning. To this end the talk was divided into four areas where privacy had been involved, three of which are briefly discussed below.</p>
<h3 id="accessing-data">Accessing Data</h3>
<p>Access to personal data was essential for the success of its first mobile app, so to ensure that this was given the company decided to earn users’ trust by focusing on privacy. To this end, it was decided to keep all data locally on users’ devices and do the processing there instead of on company servers.</p>
<p>These on-device privacy solutions have the extra benefit of being easy to explain, and may have accounted for the high percentage of users willing to give the mobile app access to sensitive information such as emails, chats, location tracking, and even screen content.</p>
<h3 id="protecting-the-company">Protecting the Company</h3>
<p>By the principle of <a href="https://www.schneier.com/blog/archives/2016/03/data_is_a_toxic.html"><em>Data is a Toxic Asset</em></a>, not storing any user data means less to worry about if company servers are ever compromised. However, some services hosted by third parties, including the company, may build up a set of metadata that in itself could reveal something about the users and e.g. damage reputation. One such example is <em>point-of-interest</em> services where a user reveals their location in order to obtain e.g. a list of nearby restaurants.</p>
<p>Powerful cryptographic techniques, such as the <a href="https://www.torproject.org/">Tor network</a> and <a href="https://en.wikipedia.org/wiki/Private_information_retrieval">private information retrieval</a>, may make it possible for companies to make private versions of these services, yet also impose a significant overhead. Instead, by assuming that the company is generally honest, a more efficient compromise can be reached by shifting the focus from deliberate malicious behaviour to easier problems such as accidental storing or logging.</p>
<p>One concrete approach taken for this was to strip sensitive information at the server entry point so that it was never exposed to subcomponents.</p>
<h3 id="learning-from-data">Learning from Data</h3>
<p>While it is great for user privacy to only have locally stored data sets, it is also relevant for both users and the company to get insights from these, for instance as a way of making cross-user recommendations or getting model feedback.</p>
<p>The key to resolving this apparent contradiction is that often there is no need to share individual data as long as a global view can be computed. A brief comparison between techniques was made, including:</p>
<ul>
<li>
<p><strong>sensor networks</strong>: high performance but requires a lot of coordination between users</p>
</li>
<li>
<p><strong>differential privacy</strong>: high performance and strong privacy guarantees, but a lot of data is needed for the signal to overcome the noise</p>
</li>
<li>
<p><strong>homomorphic encryption</strong>: flexible and explainable, but still not very efficient and has the issue of who’s holding the decryption keys</p>
</li>
<li>
<p><strong>multi-party computation</strong>: flexible and decent performance, but requires several players to distribute trust to</p>
</li>
</ul>
<p>and concluding with the specialised multi-party computation protocol underlying SDA and further detailed below.</p>
<h1 id="private-data-aggregation-on-a-budget">Private Data Aggregation on a Budget</h1>
<p>Given at the workshop on <a href="http://www.multipartycomputation.com/tpmpc-2017"><em>Theory and Practice of Multi-Party Computation (TPMPC’17)</em></a>, this <a href="https://github.com/mortendahl/privateml/raw/master/talks/TPMPC17-slides.pdf">talk</a> was technical in nature in that it presented the <a href="https://eprint.iacr.org/2017/643">SDA protocol</a>, but also aimed at illustrating the problem that a company may experience when wanting to solve a privacy problem by employing a secure multi-party computation (MPC) protocol: namely, that it may find itself to be the only party that is naturally motivated to invest resources into it.</p>
<p>Moreover, to remain open to as many potential other parties as possible, it is interesting to minimise the requirements on these in terms of computation, communication, and coordination. By doing so, parties running e.g. mobile devices or web browsers may be considered. These concerns, however, are not always considered in typical MPC protocols.</p>
<h3 id="community-based-mpc">Community-based MPC</h3>
<p>To this end SDA presents a simple but concrete proposal in a <em>community-based model</em> where members from a community are used as parties.</p>
<p>These parties only have to make a minimum of investment as most of the computation is out-sourced to the company and very little coordination is required between the selected members. Furthermore, a mechanism for distributing work is also presented that allows for lowering the individual load by involving more members.</p>
<p>The result is a practical protocol for <em>aggregating high-dimensional vectors</em> that is suitable for a single company with a community of sporadic members.</p>
<h3 id="applications">Applications</h3>
<p>Concrete and realistic applications were also considered, including analytics, surveys, and place discovery based on users’ location history.</p>
<p>As illustrated, the load on community members in these applications was low enough to be reasonably run on mobile phones and even web browsers.</p>
<p>This work was also presented at <a href="https://pmpml.github.io/PMPML16/"><em>Private Multi-Party Machine Learning (PMPML’16)</em></a> in the form of a <a href="https://github.com/mortendahl/privateml/raw/master/talks/PMPML16-poster.pdf">poster</a>.</p>Morten DahlDuring winter and spring I was fortunate enough to have a few occasions to talk about some of the work done at Snips on applying privacy-enhancing technologies in a start-up building privacy-aware machine learning systems for mobile devices.Secret Sharing, Part 22017-06-24T12:00:00+00:002017-06-24T12:00:00+00:00https://mortendahl.github.io/2017/06/24/secret-sharing-part2<p><em><strong>TL;DR:</strong> efficient secret sharing requires fast polynomial evaluation and interpolation; here we go through what it takes to use the well-known Fast Fourier Transform for this.</em></p>
<p>In the <a href="/2017/06/04/secret-sharing-part1/">first part</a> we looked at Shamir’s scheme, as well as its packed variant where several secrets are shared together. We saw that polynomials lie at the core of both schemes, and that implementation is basically a question of (partially) converting back and forth between two different representations of these. We also gave typical algorithms for doing this.</p>
<p>For this part we will look at somewhat more complex algorithms in an attempt to speed up the computations needed for generating shares. Specifically, we will implement and apply the Fast Fourier Transform, detailing all the essential steps. Performance measurements with <a href="https://github.com/mortendahl/rust-threshold-secret-sharing">our Rust implementation</a> show that this yields orders of magnitude of efficiency improvements when either the number of shares or the number of secrets is high.</p>
<p>There is also an <a href="https://github.com/mortendahl/privateml/blob/master/secret-sharing/Fast%20Fourier%20Transform.ipynb">associated Python notebook</a> to better see how the code samples fit together in the bigger picture.</p>
<h1 id="polynomials">Polynomials</h1>
<p>If we <a href="/2017/06/04/secret-sharing-part1/">look back</a> at Shamir’s scheme we see that it’s all about polynomials: a random polynomial embedding the secret is sampled and the shares are taken as its values at a certain set of points.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">shamir_share</span><span class="p">(</span><span class="n">secret</span><span class="p">):</span>
    <span class="n">polynomial</span> <span class="o">=</span> <span class="n">sample_shamir_polynomial</span><span class="p">(</span><span class="n">secret</span><span class="p">)</span>
    <span class="n">shares</span> <span class="o">=</span> <span class="p">[</span> <span class="n">evaluate_at_point</span><span class="p">(</span><span class="n">polynomial</span><span class="p">,</span> <span class="n">p</span><span class="p">)</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">SHARE_POINTS</span> <span class="p">]</span>
    <span class="k">return</span> <span class="n">shares</span>
</code></pre></div></div>
<p>The same goes for the packed variant, where several secrets are embedded in the sampled polynomial.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">packed_share</span><span class="p">(</span><span class="n">secrets</span><span class="p">):</span>
    <span class="n">polynomial</span> <span class="o">=</span> <span class="n">sample_packed_polynomial</span><span class="p">(</span><span class="n">secrets</span><span class="p">)</span>
    <span class="n">shares</span> <span class="o">=</span> <span class="p">[</span> <span class="n">interpolate_at_point</span><span class="p">(</span><span class="n">polynomial</span><span class="p">,</span> <span class="n">p</span><span class="p">)</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">SHARE_POINTS</span> <span class="p">]</span>
    <span class="k">return</span> <span class="n">shares</span>
</code></pre></div></div>
<p>Notice however that they differ slightly in the second step where the shares are computed: Shamir’s scheme uses <code class="language-plaintext highlighter-rouge">evaluate_at_point</code> while the packed variant uses <code class="language-plaintext highlighter-rouge">interpolate_at_point</code>. The reason is that the sampled polynomial in the former case is in <em>coefficient representation</em> while in the latter it is in <em>point-value representation</em>.</p>
<p>Specifically, we often represent a polynomial <code class="language-plaintext highlighter-rouge">f</code> of degree <code class="language-plaintext highlighter-rouge">D == L-1</code> by a list of <code class="language-plaintext highlighter-rouge">L</code> coefficients <code class="language-plaintext highlighter-rouge">a0, ..., aD</code> such that <code class="language-plaintext highlighter-rouge">f(x) = (a0) + (a1 * x) + (a2 * x^2) + ... + (aD * x^D)</code>. This representation is convenient for many things, including efficiently evaluating the polynomial at a given point using e.g. <a href="https://en.wikipedia.org/wiki/Horner%27s_method">Horner’s method</a>.</p>
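<p>As a minimal sketch of evaluation in coefficient representation, Horner’s method can be written as follows. The helper name <code class="language-plaintext highlighter-rouge">horner_evaluate</code> matches the one used later in this post, but this particular body and the small prime <code class="language-plaintext highlighter-rouge">Q = 433</code> are assumptions made for illustration only.</p>

```python
Q = 433  # small prime modulus, assumed here purely for illustration

def horner_evaluate(coeffs, point):
    # coeffs = [a0, a1, ..., aD]; fold from the highest degree down so that
    # each step costs one multiplication and one addition
    result = 0
    for coeff in reversed(coeffs):
        result = (coeff + point * result) % Q
    return result

# f(x) = 1 + 2x + 3x^2 + 4x^3 evaluated at x = 2
assert horner_evaluate([1, 2, 3, 4], 2) == (1 + 2*2 + 3*4 + 4*8) % Q
```

<p>Each evaluation uses a number of operations linear in the number of coefficients, which is what makes this representation convenient when only a few points are needed.</p>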
<p>However, every such polynomial may also be represented by a set of <code class="language-plaintext highlighter-rouge">L</code> point-value pairs <code class="language-plaintext highlighter-rouge">(p1, v1), ..., (pL, vL)</code> where <code class="language-plaintext highlighter-rouge">vi == f(pi)</code> and all the <code class="language-plaintext highlighter-rouge">pi</code> are distinct. Evaluating the polynomial at a given point is still possible, yet now requires a more involved <em>interpolation</em> procedure that may be less efficient.</p>
<p>But the point-value representation also has several advantages, most importantly that every element intuitively contributes with the same amount of information, unlike the coefficient representation where, in the case of secret sharing, a few elements are the actual secrets; this property gives us the privacy guarantee we are after. Moreover, a degree <code class="language-plaintext highlighter-rouge">L-1</code> polynomial may also be represented by <em>more than</em> <code class="language-plaintext highlighter-rouge">L</code> pairs; in this case there is some redundancy in the representation that we may for instance take advantage of in secret sharing (to reconstruct even if some shares are lost) and in coding theory (to decode correctly even if some errors occur during transmission).</p>
<p>The reason this works is that the result of interpolation on a point-value representation with <code class="language-plaintext highlighter-rouge">L</code> pairs is technically speaking defined with respect to the <em>least degree</em> polynomial <code class="language-plaintext highlighter-rouge">g</code> such that <code class="language-plaintext highlighter-rouge">g(pi) == vi</code> for all pairs in the set, which is <a href="https://en.wikipedia.org/wiki/Polynomial_interpolation#Uniqueness_of_the_interpolating_polynomial">unique</a> and has at most degree <code class="language-plaintext highlighter-rouge">L-1</code>. This means that if two point-value representations are generated using the same polynomial <code class="language-plaintext highlighter-rouge">g</code> then interpolation on these will yield identical results, even when the two sets are of different sizes or use different points, since the least degree polynomial is the same.</p>
<p>It is also why we can use the two representations somewhat interchangeably: if a point-value representation with <code class="language-plaintext highlighter-rouge">L</code> pairs was generated by a degree <code class="language-plaintext highlighter-rouge">L-1</code> polynomial <code class="language-plaintext highlighter-rouge">f</code>, then the unique least degree polynomial agreeing with these must be <code class="language-plaintext highlighter-rouge">f</code>. And since, for a fixed set of points, the set of coefficient lists of length <code class="language-plaintext highlighter-rouge">L</code> and the set of value lists of length <code class="language-plaintext highlighter-rouge">L</code> have the same cardinality (in our case <code class="language-plaintext highlighter-rouge">Q^L</code>) we must have a bijection between them.</p>
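<p>The uniqueness argument above can be checked numerically with a small sketch based on Lagrange interpolation. The function names, the example polynomial, and the choice <code class="language-plaintext highlighter-rouge">Q = 433</code> are assumptions for illustration and are not taken from the implementations discussed in this series.</p>

```python
Q = 433  # small prime modulus, assumed for illustration

def evaluate(coeffs, x):
    # Horner evaluation of a polynomial in coefficient representation
    result = 0
    for c in reversed(coeffs):
        result = (c + x * result) % Q
    return result

def lagrange_interpolate_at(points, values, x):
    # value at x of the least degree polynomial through the (point, value) pairs
    total = 0
    for i, (pi, vi) in enumerate(zip(points, values)):
        num, den = 1, 1
        for j, pj in enumerate(points):
            if i != j:
                num = num * (x - pj) % Q
                den = den * (pi - pj) % Q
        # den is non-zero since the points are distinct; invert it via Fermat
        total = (total + vi * num * pow(den, Q - 2, Q)) % Q
    return total

f = [5, 7, 11]                  # degree 2, so L == 3
small = [1, 2, 3]               # exactly L points
large = [10, 20, 30, 40, 50]    # more than L points, i.e. redundant
small_values = [evaluate(f, p) for p in small]
large_values = [evaluate(f, p) for p in large]

# both representations interpolate to the same polynomial f
for x in [0, 4, 123]:
    assert lagrange_interpolate_at(small, small_values, x) == evaluate(f, x)
    assert lagrange_interpolate_at(large, large_values, x) == evaluate(f, x)
```

<p>Note that each interpolated value costs a number of operations quadratic in the number of pairs, which is the inefficiency the rest of this post sets out to reduce.</p>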
<h1 id="fast-fourier-transform">Fast Fourier Transform</h1>
<p>With the two representations of polynomials in mind we move on to how the <a href="https://en.wikipedia.org/wiki/Fast_Fourier_transform">Fast Fourier Transform</a> (FFT) over finite fields – <em>also known as the <a href="https://en.wikipedia.org/wiki/Discrete_Fourier_transform_(general)#Number-theoretic_transform">Number Theoretic Transform</a> (NTT)</em> – can be used to perform efficient conversion between them. And for me the best way of understanding this is through an example that can later be generalised into an algorithm.</p>
<h2 id="walk-through-example">Walk-through example</h2>
<p>Recall that all our computations happen in a prime field determined by a fixed prime <code class="language-plaintext highlighter-rouge">Q</code>, i.e. using the numbers <code class="language-plaintext highlighter-rouge">0, 1, ..., Q-1</code>. In this example we will use <code class="language-plaintext highlighter-rouge">Q = 433</code>, whose multiplicative group has order <code class="language-plaintext highlighter-rouge">Q-1</code>, which is divisible by <code class="language-plaintext highlighter-rouge">4</code>: <code class="language-plaintext highlighter-rouge">Q-1 == 432 == 4 * k</code> with <code class="language-plaintext highlighter-rouge">k = 108</code>.</p>
<p>Assume then that we have a polynomial <code class="language-plaintext highlighter-rouge">A(x) = 1 + 2x + 3x^2 + 4x^3</code> over this field with <code class="language-plaintext highlighter-rouge">L == 4</code> coefficients and degree <code class="language-plaintext highlighter-rouge">L-1 == 3</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">A_coeffs</span> <span class="o">=</span> <span class="p">[</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span> <span class="p">]</span>
</code></pre></div></div>
<p>Our goal is to turn this list of coefficients into a list of values <code class="language-plaintext highlighter-rouge">[ A(w0), A(w1), A(w2), A(w3) ]</code> of equal length, for points <code class="language-plaintext highlighter-rouge">w = [w0, w1, w2, w3]</code>.</p>
<p>The standard way of evaluating polynomials is of course one way of doing this, which using Horner’s rule can be done in a total of <code class="language-plaintext highlighter-rouge">Oh(L * L)</code> operations.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">A</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">horner_evaluate</span><span class="p">(</span><span class="n">A_coeffs</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span>
<span class="k">assert</span><span class="p">([</span> <span class="n">A</span><span class="p">(</span><span class="n">wi</span><span class="p">)</span> <span class="k">for</span> <span class="n">wi</span> <span class="ow">in</span> <span class="n">w</span> <span class="p">]</span>
<span class="o">==</span> <span class="p">[</span> <span class="mi">10</span><span class="p">,</span> <span class="mi">73</span><span class="p">,</span> <span class="mi">431</span><span class="p">,</span> <span class="mi">356</span> <span class="p">])</span>
</code></pre></div></div>
<p>But as we will see, the FFT allows us to do so more efficiently when the length is sufficiently large and the points are chosen with a certain structure; asymptotically we can compute the values in <code class="language-plaintext highlighter-rouge">Oh(L * log L)</code> operations.</p>
<p>The first insight we need is that there is an alternative evaluation strategy that breaks <code class="language-plaintext highlighter-rouge">A</code> into two smaller polynomials. In particular, if we define polynomials <code class="language-plaintext highlighter-rouge">B(y) = 1 + 3y</code> and <code class="language-plaintext highlighter-rouge">C(y) = 2 + 4y</code> by taking every other coefficient from <code class="language-plaintext highlighter-rouge">A</code> then we have <code class="language-plaintext highlighter-rouge">A(x) == B(x * x) + x * C(x * x)</code>, which is straight-forward to verify by simply writing out the right-hand side.</p>
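<p>Instead of writing out the right-hand side by hand, the identity can also be checked exhaustively over the whole field. The sketch below uses a hypothetical helper <code class="language-plaintext highlighter-rouge">evaluate</code> that is not part of the original code.</p>

```python
Q = 433
A_coeffs = [1, 2, 3, 4]

def evaluate(coeffs, x):
    # naive evaluation: sum of c * x^e over all coefficients
    return sum(c * pow(x, e, Q) for e, c in enumerate(coeffs)) % Q

# every other coefficient, starting at index 0 and index 1 respectively
B_coeffs = A_coeffs[0::2]  # B(y) = 1 + 3y
C_coeffs = A_coeffs[1::2]  # C(y) = 2 + 4y

# check A(x) == B(x * x) + x * C(x * x) for every element of the field
for x in range(Q):
    y = x * x % Q
    assert evaluate(A_coeffs, x) == (evaluate(B_coeffs, y) + x * evaluate(C_coeffs, y)) % Q
```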
<p>This means that if we know values of <code class="language-plaintext highlighter-rouge">B(y)</code> and <code class="language-plaintext highlighter-rouge">C(y)</code> at the <em>squares</em> <code class="language-plaintext highlighter-rouge">v</code> of the <code class="language-plaintext highlighter-rouge">w</code> points, then we can use these to compute the values of <code class="language-plaintext highlighter-rouge">A(x)</code> at the <code class="language-plaintext highlighter-rouge">w</code> points using table look-ups: <code class="language-plaintext highlighter-rouge">A_values[i] = B_values[i] + w[i] * C_values[i]</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># split A into B and C
</span><span class="n">B_coeffs</span> <span class="o">=</span> <span class="n">A_coeffs</span><span class="p">[</span><span class="mi">0</span><span class="p">::</span><span class="mi">2</span><span class="p">]</span> <span class="c1"># == [ 1, 3 ]
</span><span class="n">C_coeffs</span> <span class="o">=</span> <span class="n">A_coeffs</span><span class="p">[</span><span class="mi">1</span><span class="p">::</span><span class="mi">2</span><span class="p">]</span> <span class="c1"># == [ 2, 4 ]
</span>
<span class="c1"># square the w points
</span><span class="n">v</span> <span class="o">=</span> <span class="p">[</span> <span class="n">wi</span> <span class="o">*</span> <span class="n">wi</span> <span class="o">%</span> <span class="n">Q</span> <span class="k">for</span> <span class="n">wi</span> <span class="ow">in</span> <span class="n">w</span> <span class="p">]</span>
<span class="c1"># somehow compute the values of B and C at the v points
# ...
</span><span class="k">assert</span><span class="p">(</span> <span class="n">B_values</span> <span class="o">==</span> <span class="p">[</span> <span class="n">B</span><span class="p">(</span><span class="n">vi</span><span class="p">)</span> <span class="k">for</span> <span class="n">vi</span> <span class="ow">in</span> <span class="n">v</span> <span class="p">]</span> <span class="p">)</span>
<span class="k">assert</span><span class="p">(</span> <span class="n">C_values</span> <span class="o">==</span> <span class="p">[</span> <span class="n">C</span><span class="p">(</span><span class="n">vi</span><span class="p">)</span> <span class="k">for</span> <span class="n">vi</span> <span class="ow">in</span> <span class="n">v</span> <span class="p">]</span> <span class="p">)</span>
<span class="c1"># combine results into values of A at the w points
</span><span class="n">A_values</span> <span class="o">=</span> <span class="p">[</span> <span class="p">(</span> <span class="n">B_values</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">w</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">*</span> <span class="n">C_values</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="p">)</span> <span class="o">%</span> <span class="n">Q</span> <span class="k">for</span> <span class="n">i</span><span class="p">,</span><span class="n">_</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">w</span><span class="p">)</span> <span class="p">]</span>
<span class="k">assert</span><span class="p">(</span> <span class="n">A_values</span> <span class="o">==</span> <span class="p">[</span> <span class="n">A</span><span class="p">(</span><span class="n">wi</span><span class="p">)</span> <span class="k">for</span> <span class="n">wi</span> <span class="ow">in</span> <span class="n">w</span> <span class="p">]</span> <span class="p">)</span>
</code></pre></div></div>
<p>So far we haven’t saved much, but the second insight fixes that: by picking the points <code class="language-plaintext highlighter-rouge">w</code> to be the elements of a subgroup of order 4, the <code class="language-plaintext highlighter-rouge">v</code> points used for <code class="language-plaintext highlighter-rouge">B</code> and <code class="language-plaintext highlighter-rouge">C</code> will form a subgroup of order 2 due to the squaring; hence, we will have <code class="language-plaintext highlighter-rouge">v[0] == v[2]</code> and <code class="language-plaintext highlighter-rouge">v[1] == v[3]</code> and so only need the first halves of <code class="language-plaintext highlighter-rouge">B_values</code> and <code class="language-plaintext highlighter-rouge">C_values</code> – as such we have cut the subproblems in half!</p>
<p>Such subgroups are typically characterized by a generator, i.e. an element of the field that when raised to powers will take on exactly the values of the subgroup elements. Historically such generators are denoted by the omega symbol so let’s follow that convention here as well.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># generator of subgroup of order 4
</span><span class="n">omega4</span> <span class="o">=</span> <span class="mi">179</span>
<span class="n">w</span> <span class="o">=</span> <span class="p">[</span> <span class="nb">pow</span><span class="p">(</span><span class="n">omega4</span><span class="p">,</span> <span class="n">e</span><span class="p">,</span> <span class="n">Q</span><span class="p">)</span> <span class="k">for</span> <span class="n">e</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">4</span><span class="p">)</span> <span class="p">]</span>
<span class="k">assert</span><span class="p">(</span> <span class="n">w</span> <span class="o">==</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">179</span><span class="p">,</span> <span class="mi">432</span><span class="p">,</span> <span class="mi">254</span><span class="p">]</span> <span class="p">)</span>
</code></pre></div></div>
<p>We shall return to how to find such generators below, but note that once we know one of order 4 then it’s easy to find one of order 2: we simply square.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># generator of subgroup of order 2
</span><span class="n">omega2</span> <span class="o">=</span> <span class="n">omega4</span> <span class="o">*</span> <span class="n">omega4</span> <span class="o">%</span> <span class="n">Q</span>
<span class="n">v</span> <span class="o">=</span> <span class="p">[</span> <span class="nb">pow</span><span class="p">(</span><span class="n">omega2</span><span class="p">,</span> <span class="n">e</span><span class="p">,</span> <span class="n">Q</span><span class="p">)</span> <span class="k">for</span> <span class="n">e</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">2</span><span class="p">)</span> <span class="p">]</span>
<span class="k">assert</span><span class="p">(</span> <span class="n">v</span> <span class="o">==</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">432</span><span class="p">]</span> <span class="p">)</span>
</code></pre></div></div>
<p>As a quick test we may also check that the orders are indeed as claimed. Specifically, if we keep raising <code class="language-plaintext highlighter-rouge">omega4</code> to higher powers then we expect to keep visiting the same four numbers, and likewise we expect to keep visiting the same two numbers for <code class="language-plaintext highlighter-rouge">omega2</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">assert</span><span class="p">(</span> <span class="p">[</span> <span class="nb">pow</span><span class="p">(</span><span class="n">omega4</span><span class="p">,</span> <span class="n">e</span><span class="p">,</span> <span class="n">Q</span><span class="p">)</span> <span class="k">for</span> <span class="n">e</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">8</span><span class="p">)</span> <span class="p">]</span> <span class="o">==</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">179</span><span class="p">,</span> <span class="mi">432</span><span class="p">,</span> <span class="mi">254</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">179</span><span class="p">,</span> <span class="mi">432</span><span class="p">,</span> <span class="mi">254</span><span class="p">]</span> <span class="p">)</span>
<span class="k">assert</span><span class="p">(</span> <span class="p">[</span> <span class="nb">pow</span><span class="p">(</span><span class="n">omega2</span><span class="p">,</span> <span class="n">e</span><span class="p">,</span> <span class="n">Q</span><span class="p">)</span> <span class="k">for</span> <span class="n">e</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">8</span><span class="p">)</span> <span class="p">]</span> <span class="o">==</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">432</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">432</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">432</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">432</span><span class="p">]</span> <span class="p">)</span>
</code></pre></div></div>
<p>Using generators we also see that there is no need to explicitly calculate the lists <code class="language-plaintext highlighter-rouge">w</code> and <code class="language-plaintext highlighter-rouge">v</code> anymore as they are now implicitly defined by the generator. So, with this change we come back to our mission of computing the values of <code class="language-plaintext highlighter-rouge">A</code> at the points determined by the powers of <code class="language-plaintext highlighter-rouge">omega4</code>, which may then be done via <code class="language-plaintext highlighter-rouge">A_values[i] = B_values[i % 2] + pow(omega4, i, Q) * C_values[i % 2]</code>.</p>
<p>The third and final insight we need is that we can of course continue this process of dividing the polynomial in half: to compute e.g. <code class="language-plaintext highlighter-rouge">B_values</code> we break <code class="language-plaintext highlighter-rouge">B</code> into two polynomials <code class="language-plaintext highlighter-rouge">D</code> and <code class="language-plaintext highlighter-rouge">E</code> and then follow the same procedure; in this case <code class="language-plaintext highlighter-rouge">D</code> and <code class="language-plaintext highlighter-rouge">E</code> will be simple constants but it works in the general case as well. The only requirement is that the length <code class="language-plaintext highlighter-rouge">L</code> is a power of 2 and that we can find a generator <code class="language-plaintext highlighter-rouge">omegaL</code> of a subgroup of this size.</p>
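<p>Putting the pieces of the walk-through together, the following self-contained sketch computes the values of <code class="language-plaintext highlighter-rouge">A</code> from only two values each of <code class="language-plaintext highlighter-rouge">B</code> and <code class="language-plaintext highlighter-rouge">C</code>. The helper <code class="language-plaintext highlighter-rouge">evaluate</code> is a hypothetical stand-in for the step where the smaller problems are solved.</p>

```python
Q = 433
A_coeffs = [1, 2, 3, 4]
omega4 = 179                   # generator of the subgroup of order 4
omega2 = omega4 * omega4 % Q   # generator of the subgroup of order 2

B_coeffs = A_coeffs[0::2]  # B(y) = 1 + 3y
C_coeffs = A_coeffs[1::2]  # C(y) = 2 + 4y

def evaluate(coeffs, x):
    # naive evaluation, standing in for the smaller subproblems
    return sum(c * pow(x, e, Q) for e, c in enumerate(coeffs)) % Q

# only two values of each smaller polynomial are needed
B_values = [evaluate(B_coeffs, pow(omega2, i, Q)) for i in range(2)]
C_values = [evaluate(C_coeffs, pow(omega2, i, Q)) for i in range(2)]

# combine into the four values of A at the powers of omega4
A_values = [(B_values[i % 2] + pow(omega4, i, Q) * C_values[i % 2]) % Q
            for i in range(4)]
assert A_values == [10, 73, 431, 356]
```

<p>The result agrees with the values obtained earlier through direct evaluation at the four points.</p>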
<h2 id="algorithm-for-powers-of-2">Algorithm for powers of 2</h2>
<p>Putting the above into an algorithm we get the following, where <code class="language-plaintext highlighter-rouge">omega</code> is assumed to be a generator of order <code class="language-plaintext highlighter-rouge">len(A_coeffs)</code>. Note that some typical optimizations are omitted for clarity (but see e.g. <a href="https://github.com/mortendahl/privateml/blob/master/secret-sharing/Fast%20Fourier%20Transform.ipynb">the Python notebook</a>).</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">fft2_forward</span><span class="p">(</span><span class="n">A_coeffs</span><span class="p">,</span> <span class="n">omega</span><span class="p">):</span>
    <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">A_coeffs</span><span class="p">)</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">A_coeffs</span>
    <span class="c1"># split A into B and C such that A(x) = B(x^2) + x * C(x^2)
</span>    <span class="n">B_coeffs</span> <span class="o">=</span> <span class="n">A_coeffs</span><span class="p">[</span><span class="mi">0</span><span class="p">::</span><span class="mi">2</span><span class="p">]</span>
    <span class="n">C_coeffs</span> <span class="o">=</span> <span class="n">A_coeffs</span><span class="p">[</span><span class="mi">1</span><span class="p">::</span><span class="mi">2</span><span class="p">]</span>
    <span class="c1"># apply recursively
</span>    <span class="n">omega_squared</span> <span class="o">=</span> <span class="nb">pow</span><span class="p">(</span><span class="n">omega</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="n">Q</span><span class="p">)</span>
    <span class="n">B_values</span> <span class="o">=</span> <span class="n">fft2_forward</span><span class="p">(</span><span class="n">B_coeffs</span><span class="p">,</span> <span class="n">omega_squared</span><span class="p">)</span>
    <span class="n">C_values</span> <span class="o">=</span> <span class="n">fft2_forward</span><span class="p">(</span><span class="n">C_coeffs</span><span class="p">,</span> <span class="n">omega_squared</span><span class="p">)</span>
    <span class="c1"># combine subresults
</span>    <span class="n">A_values</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="nb">len</span><span class="p">(</span><span class="n">A_coeffs</span><span class="p">)</span>
    <span class="n">L_half</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">A_coeffs</span><span class="p">)</span> <span class="o">//</span> <span class="mi">2</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">L_half</span><span class="p">):</span>
        <span class="n">j</span> <span class="o">=</span> <span class="n">i</span>
        <span class="n">x</span> <span class="o">=</span> <span class="nb">pow</span><span class="p">(</span><span class="n">omega</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">Q</span><span class="p">)</span>
        <span class="n">A_values</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="n">B_values</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">x</span> <span class="o">*</span> <span class="n">C_values</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="o">%</span> <span class="n">Q</span>
        <span class="n">j</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="n">L_half</span>
        <span class="n">x</span> <span class="o">=</span> <span class="nb">pow</span><span class="p">(</span><span class="n">omega</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">Q</span><span class="p">)</span>
        <span class="n">A_values</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="n">B_values</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">x</span> <span class="o">*</span> <span class="n">C_values</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="o">%</span> <span class="n">Q</span>
    <span class="k">return</span> <span class="n">A_values</span>
</code></pre></div></div>
<p>With this procedure we may convert a polynomial in coefficient form to its point-value form, i.e. evaluate the polynomial, in <code class="language-plaintext highlighter-rouge">Oh(L * log L)</code> operations.</p>
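<p>As a sanity check, the procedure can be compared against direct evaluation. The sketch below repeats a condensed version of the forward procedure so that it runs on its own. The brute-force search for a generator of order 8 is an assumption made only for this small field and is not how generators are found in practice.</p>

```python
Q = 433

def fft2_forward(A_coeffs, omega):
    # condensed version of the recursive procedure
    if len(A_coeffs) == 1:
        return A_coeffs
    omega_squared = pow(omega, 2, Q)
    B_values = fft2_forward(A_coeffs[0::2], omega_squared)
    C_values = fft2_forward(A_coeffs[1::2], omega_squared)
    L_half = len(A_coeffs) // 2
    A_values = [0] * len(A_coeffs)
    for i in range(L_half):
        for j in (i, i + L_half):
            A_values[j] = (B_values[i] + pow(omega, j, Q) * C_values[i]) % Q
    return A_values

def evaluate(coeffs, x):
    # naive evaluation used as the reference
    return sum(c * pow(x, e, Q) for e, c in enumerate(coeffs)) % Q

# the walk-through example, using the generator of order 4
assert fft2_forward([1, 2, 3, 4], 179) == [10, 73, 431, 356]

# a longer example: an element x with x^8 == 1 but x^4 != 1 has order exactly 8
omega8 = next(x for x in range(2, Q) if pow(x, 8, Q) == 1 and pow(x, 4, Q) != 1)
coeffs = [3, 1, 4, 1, 5, 9, 2, 6]
expected = [evaluate(coeffs, pow(omega8, i, Q)) for i in range(8)]
assert fft2_forward(coeffs, omega8) == expected
```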
<p>The freedom we gave up to achieve this is that the number of coefficients <code class="language-plaintext highlighter-rouge">L</code> must now be a power of 2; but of course, some of them may be zero so we are still free to choose the degree of the polynomial as we wish up to <code class="language-plaintext highlighter-rouge">L-1</code>. Also, we are no longer free to choose any set of evaluation points but have to choose a set with a certain subgroup structure.</p>
<p>Finally, it turns out that we can also use the above procedure to go in the opposite direction from point-value form to coefficient form, i.e. interpolate the least degree polynomial. This is done by essentially treating the values as coefficients and running the forward procedure with the inverse of <code class="language-plaintext highlighter-rouge">omega</code>, followed by a scaling by the inverse of <code class="language-plaintext highlighter-rouge">L</code>, but we won’t go into the details here.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">fft2_backward</span><span class="p">(</span><span class="n">A_values</span><span class="p">,</span> <span class="n">omega</span><span class="p">):</span>
    <span class="n">L_inv</span> <span class="o">=</span> <span class="n">inverse</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">A_values</span><span class="p">))</span>
    <span class="n">A_coeffs</span> <span class="o">=</span> <span class="p">[</span> <span class="p">(</span><span class="n">a</span> <span class="o">*</span> <span class="n">L_inv</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span> <span class="k">for</span> <span class="n">a</span> <span class="ow">in</span> <span class="n">fft2_forward</span><span class="p">(</span><span class="n">A_values</span><span class="p">,</span> <span class="n">inverse</span><span class="p">(</span><span class="n">omega</span><span class="p">))</span> <span class="p">]</span>
    <span class="k">return</span> <span class="n">A_coeffs</span>
</code></pre></div></div>
<p>Here however we may feel a stronger impact of the constraints implied by the FFT: while we can use zero coefficients to “patch up” the coefficient representation of a lower degree polynomial to make its length match our target length <code class="language-plaintext highlighter-rouge">L</code> while keeping its identity, we cannot simply add e.g. zero pairs to a point-value representation as it may change the implicit least degree polynomial; as we will see in the next blog post this has implications for our application to secret sharing if we also want to use the FFT for reconstruction.</p>
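<p>Both the round trip and the asymmetry between padding coefficients and padding values can be illustrated with a small self-contained sketch. The <code class="language-plaintext highlighter-rouge">inverse</code> helper shown here, based on Fermat’s little theorem, is an assumption about what such a helper could look like, and the brute-force search for <code class="language-plaintext highlighter-rouge">omega8</code> is only reasonable for this tiny field.</p>

```python
Q = 433

def inverse(x):
    # modular inverse via Fermat's little theorem; assumes Q prime and x != 0
    return pow(x, Q - 2, Q)

def fft2_forward(A_coeffs, omega):
    # condensed version of the recursive forward procedure
    if len(A_coeffs) == 1:
        return A_coeffs
    omega_squared = pow(omega, 2, Q)
    B_values = fft2_forward(A_coeffs[0::2], omega_squared)
    C_values = fft2_forward(A_coeffs[1::2], omega_squared)
    L_half = len(A_coeffs) // 2
    A_values = [0] * len(A_coeffs)
    for i in range(L_half):
        for j in (i, i + L_half):
            A_values[j] = (B_values[i] + pow(omega, j, Q) * C_values[i]) % Q
    return A_values

def fft2_backward(A_values, omega):
    # forward procedure with the inverse generator, then scaling by 1/L
    L_inv = inverse(len(A_values))
    return [(a * L_inv) % Q for a in fft2_forward(A_values, inverse(omega))]

omega4 = 179
omega8 = next(x for x in range(2, Q) if pow(x, 8, Q) == 1 and pow(x, 4, Q) != 1)

# round trip on the walk-through example
coeffs = [1, 2, 3, 4]
values = fft2_forward(coeffs, omega4)
assert fft2_backward(values, omega4) == coeffs

# padding the coefficients with zeros keeps the polynomial intact
padded = coeffs + [0, 0, 0, 0]
assert fft2_backward(fft2_forward(padded, omega8), omega8) == padded

# padding the values with zeros yields a different polynomial
assert fft2_backward(values + [0, 0, 0, 0], omega8) != padded
```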
<h2 id="algorithm-for-powers-of-3">Algorithm for powers of 3</h2>
<p>Unsurprisingly there is nothing in the principles behind the FFT that means it will only work for powers of 2, and other bases can indeed be used as well. Luckily perhaps, since this plays a big part in our application to secret sharing as we will see below.</p>
<p>To adapt the FFT algorithm to powers of 3 we instead assume that the list of coefficients of <code class="language-plaintext highlighter-rouge">A</code> has such a length, and split it into three polynomials <code class="language-plaintext highlighter-rouge">B</code>, <code class="language-plaintext highlighter-rouge">C</code>, and <code class="language-plaintext highlighter-rouge">D</code> such that <code class="language-plaintext highlighter-rouge">A(x) = B(x^3) + x * C(x^3) + x^2 * D(x^3)</code>, and we use the cube of <code class="language-plaintext highlighter-rouge">omega</code> in the recursive calls instead of the square. Here <code class="language-plaintext highlighter-rouge">omega</code> is again assumed to be a generator of order <code class="language-plaintext highlighter-rouge">len(A_coeffs)</code>, but this time a power of 3.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">fft3_forward</span><span class="p">(</span><span class="n">A_coeffs</span><span class="p">,</span> <span class="n">omega</span><span class="p">):</span>
    <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">A_coeffs</span><span class="p">)</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">A_coeffs</span>
    <span class="c1"># split A into B, C, and D such that A(x) = B(x^3) + x * C(x^3) + x^2 * D(x^3)
</span>    <span class="n">B_coeffs</span> <span class="o">=</span> <span class="n">A_coeffs</span><span class="p">[</span><span class="mi">0</span><span class="p">::</span><span class="mi">3</span><span class="p">]</span>
    <span class="n">C_coeffs</span> <span class="o">=</span> <span class="n">A_coeffs</span><span class="p">[</span><span class="mi">1</span><span class="p">::</span><span class="mi">3</span><span class="p">]</span>
    <span class="n">D_coeffs</span> <span class="o">=</span> <span class="n">A_coeffs</span><span class="p">[</span><span class="mi">2</span><span class="p">::</span><span class="mi">3</span><span class="p">]</span>
    <span class="c1"># apply recursively
</span>    <span class="n">omega_cubed</span> <span class="o">=</span> <span class="nb">pow</span><span class="p">(</span><span class="n">omega</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="n">Q</span><span class="p">)</span>
    <span class="n">B_values</span> <span class="o">=</span> <span class="n">fft3_forward</span><span class="p">(</span><span class="n">B_coeffs</span><span class="p">,</span> <span class="n">omega_cubed</span><span class="p">)</span>
    <span class="n">C_values</span> <span class="o">=</span> <span class="n">fft3_forward</span><span class="p">(</span><span class="n">C_coeffs</span><span class="p">,</span> <span class="n">omega_cubed</span><span class="p">)</span>
    <span class="n">D_values</span> <span class="o">=</span> <span class="n">fft3_forward</span><span class="p">(</span><span class="n">D_coeffs</span><span class="p">,</span> <span class="n">omega_cubed</span><span class="p">)</span>
    <span class="c1"># combine subresults
</span>    <span class="n">A_values</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="nb">len</span><span class="p">(</span><span class="n">A_coeffs</span><span class="p">)</span>
    <span class="n">L_third</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">A_coeffs</span><span class="p">)</span> <span class="o">//</span> <span class="mi">3</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">L_third</span><span class="p">):</span>
        <span class="n">j</span> <span class="o">=</span> <span class="n">i</span>
        <span class="n">x</span> <span class="o">=</span> <span class="nb">pow</span><span class="p">(</span><span class="n">omega</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">Q</span><span class="p">)</span>
        <span class="n">xx</span> <span class="o">=</span> <span class="nb">pow</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="n">Q</span><span class="p">)</span>
        <span class="n">A_values</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="n">B_values</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">x</span> <span class="o">*</span> <span class="n">C_values</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">xx</span> <span class="o">*</span> <span class="n">D_values</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="o">%</span> <span class="n">Q</span>
        <span class="n">j</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="n">L_third</span>
        <span class="n">x</span> <span class="o">=</span> <span class="nb">pow</span><span class="p">(</span><span class="n">omega</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">Q</span><span class="p">)</span>
        <span class="n">xx</span> <span class="o">=</span> <span class="nb">pow</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="n">Q</span><span class="p">)</span>
        <span class="n">A_values</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="n">B_values</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">x</span> <span class="o">*</span> <span class="n">C_values</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">xx</span> <span class="o">*</span> <span class="n">D_values</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="o">%</span> <span class="n">Q</span>
        <span class="n">j</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="n">L_third</span> <span class="o">+</span> <span class="n">L_third</span>
        <span class="n">x</span> <span class="o">=</span> <span class="nb">pow</span><span class="p">(</span><span class="n">omega</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">Q</span><span class="p">)</span>
        <span class="n">xx</span> <span class="o">=</span> <span class="nb">pow</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="n">Q</span><span class="p">)</span>
        <span class="n">A_values</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="n">B_values</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">x</span> <span class="o">*</span> <span class="n">C_values</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">xx</span> <span class="o">*</span> <span class="n">D_values</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="o">%</span> <span class="n">Q</span>
    <span class="k">return</span> <span class="n">A_values</span>
</code></pre></div></div>
<p>And again we may go in the opposite direction and perform interpolation by simply treating the values as coefficients and performing a scaling.</p>
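<p>As a rough sketch of what this could look like, the block below pairs a compact restatement of the forward transform with a backward one, mirroring the structure used for powers of 2. The generator search and the use of <code class="language-plaintext highlighter-rouge">pow(x, Q - 2, Q)</code> for inverses (instead of a separate <code class="language-plaintext highlighter-rouge">inverse</code> helper) are assumptions made to keep the example self-contained.</p>

```python
Q = 433  # prime from the walk-through example; Q - 1 == 432 == 16 * 27

def fft3_forward(A_coeffs, omega):
    if len(A_coeffs) == 1:
        return A_coeffs
    omega_cubed = pow(omega, 3, Q)
    B_values = fft3_forward(A_coeffs[0::3], omega_cubed)
    C_values = fft3_forward(A_coeffs[1::3], omega_cubed)
    D_values = fft3_forward(A_coeffs[2::3], omega_cubed)
    L_third = len(A_coeffs) // 3
    A_values = [0] * len(A_coeffs)
    for j in range(len(A_coeffs)):
        i = j % L_third              # index into the three subresults
        x = pow(omega, j, Q)
        A_values[j] = (B_values[i] + x * C_values[i] + x * x * D_values[i]) % Q
    return A_values

def fft3_backward(A_values, omega):
    # treat the values as coefficients, use the inverse generator, and scale by 1/L
    L_inv = pow(len(A_values), Q - 2, Q)
    omega_inv = pow(omega, Q - 2, Q)
    return [(a * L_inv) % Q for a in fft3_forward(A_values, omega_inv)]

# assumed parameters: a generator g of the whole group, then an element of order 9
g = next(c for c in range(2, Q) if pow(c, 216, Q) != 1 and pow(c, 144, Q) != 1)
omega = pow(g, (Q - 1) // 9, Q)

coeffs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
values = fft3_forward(coeffs, omega)
recovered = fft3_backward(values, omega)   # equals coeffs again
```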
<h2 id="optimizations">Optimizations</h2>
<p>For ease of presentation we have omitted some typical optimizations here, perhaps most notably the fact that for powers of 2 we have the property that <code class="language-plaintext highlighter-rouge">pow(omega, i, Q) == -pow(omega, i + L/2, Q)</code> modulo <code class="language-plaintext highlighter-rouge">Q</code>, meaning we can cut the number of exponentiations in <code class="language-plaintext highlighter-rouge">fft2</code> in half compared to what we did above.</p>
<p>More interestingly, the FFTs can also be run in-place, reusing the list in which the input is provided. This saves memory allocations and has a significant impact on performance. Likewise, we may gain improvements by switching to another number representation such as <a href="https://en.wikipedia.org/wiki/Montgomery_modular_multiplication">Montgomery form</a>. Both of these approaches are described in further detail <a href="https://medium.com/snips-ai/optimizing-threshold-secret-sharing-c877901231e5">elsewhere</a>.</p>
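<p>One way the halving of exponentiations might look is sketched below: each loop iteration computes a single power of <code class="language-plaintext highlighter-rouge">omega</code> and reuses it, negated, for the second half of the output. This is an illustrative variant rather than the version used in the benchmarks; the generator search and the naive reference evaluation are assumptions added for checking.</p>

```python
Q = 433  # prime from the walk-through example

def fft2_forward(A_coeffs, omega):
    if len(A_coeffs) == 1:
        return A_coeffs
    omega_squared = pow(omega, 2, Q)
    B_values = fft2_forward(A_coeffs[0::2], omega_squared)
    C_values = fft2_forward(A_coeffs[1::2], omega_squared)
    L_half = len(A_coeffs) // 2
    A_values = [0] * len(A_coeffs)
    for i in range(L_half):
        x = pow(omega, i, Q)          # one exponentiation per pair of outputs
        t = (x * C_values[i]) % Q
        A_values[i] = (B_values[i] + t) % Q
        # omega^(i + L/2) == -omega^i, so the second half only flips the sign
        A_values[i + L_half] = (B_values[i] - t) % Q
    return A_values

# assumed parameters: an element of order 8 derived from a generator of the group
g = next(c for c in range(2, Q) if pow(c, 216, Q) != 1 and pow(c, 144, Q) != 1)
omega = pow(g, (Q - 1) // 8, Q)

coeffs = [1, 2, 3, 4, 5, 6, 7, 8]
values = fft2_forward(coeffs, omega)

# naive evaluation at every power of omega, used only as a reference
naive = [sum(c * pow(omega, i * j, Q) for j, c in enumerate(coeffs)) % Q
         for i in range(len(coeffs))]
```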
<h1 id="application-to-secret-sharing">Application to Secret Sharing</h1>
<p>We can now return to applying the FFT to the secret sharing schemes. As mentioned earlier, using this instead of the more traditional approaches makes most sense when the vectors we are dealing with are above a certain size, such as if we are generating many shares or sharing many secrets together.</p>
<h2 id="shamirs-scheme">Shamir’s scheme</h2>
<p>In this scheme we can easily sample our polynomial directly in coefficient representation, and hence the FFT is only relevant in the second step where we generate the shares. Concretely, we can directly sample the polynomial with the desired number of coefficients to match our privacy threshold, and add extra zeros to get a number of coefficients matching the number of shares we want; below the former list is denoted as <code class="language-plaintext highlighter-rouge">small</code> and the latter as <code class="language-plaintext highlighter-rouge">large</code>. We then apply the forward FFT to turn this into a list of values that we take as the shares.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">shamir_share</span><span class="p">(</span><span class="n">secret</span><span class="p">):</span>
    <span class="n">small_coeffs</span> <span class="o">=</span> <span class="p">[</span><span class="n">secret</span><span class="p">]</span> <span class="o">+</span> <span class="p">[</span><span class="n">random</span><span class="o">.</span><span class="n">randrange</span><span class="p">(</span><span class="n">Q</span><span class="p">)</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">T</span><span class="p">)]</span>
    <span class="n">large_coeffs</span> <span class="o">=</span> <span class="n">small_coeffs</span> <span class="o">+</span> <span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="p">(</span><span class="n">ORDER_LARGE</span> <span class="o">-</span> <span class="nb">len</span><span class="p">(</span><span class="n">small_coeffs</span><span class="p">))</span>
    <span class="n">large_values</span> <span class="o">=</span> <span class="n">fft3_forward</span><span class="p">(</span><span class="n">large_coeffs</span><span class="p">,</span> <span class="n">OMEGA_LARGE</span><span class="p">)</span>
    <span class="n">shares</span> <span class="o">=</span> <span class="n">large_values</span>
    <span class="k">return</span> <span class="n">shares</span>
</code></pre></div></div>
<p>Besides the privacy threshold <code class="language-plaintext highlighter-rouge">T</code> and the number of shares <code class="language-plaintext highlighter-rouge">N</code>, the parameters needed for the scheme are hence a prime <code class="language-plaintext highlighter-rouge">Q</code> and a generator <code class="language-plaintext highlighter-rouge">OMEGA_LARGE</code> of order <code class="language-plaintext highlighter-rouge">ORDER_LARGE == N + 1</code>.</p>
<p>Note that we’ve used the FFT for powers of 3 here to be consistent with the next scheme; the FFT for powers of 2 would of course also have worked.</p>
<h2 id="packed-scheme">Packed scheme</h2>
<p>Recall that for this scheme it is less obvious how we can sample our polynomial directly in coefficient representation, and hence we instead do so in point-value representation. Specifically, we first use the backward FFT for powers of 2 to turn such a polynomial into coefficient representation, and then as above use the forward FFT for powers of 3 on this to generate the shares.</p>
<p>We are hence dealing with two sets of points: those used during sampling, and those used during share generation. These cannot overlap! If they did, the privacy guarantee would no longer be satisfied and some of the shares might literally equal some of the secrets.</p>
<p>Preventing this from happening is the reason we use the two different bases 2 and 3: by picking co-prime bases, i.e. <code class="language-plaintext highlighter-rouge">gcd(2, 3) == 1</code>, the subgroups will only have the point 1 in common (as the two generators raised to the zeroth power). As such we are safe if we simply make sure to exclude the value at point 1 from being used. Recalling our walk-through example, this is the reason we used prime <code class="language-plaintext highlighter-rouge">Q == 433</code> since its order <code class="language-plaintext highlighter-rouge">Q-1 == 432 == 4 * 9 * k</code> is divisible by both a power of 2 and a power of 3.</p>
<p>So to do sharing we first sample the values of the polynomial, fixing the value at point 1 to be a constant (in this case zero). Using the backward FFT we then turn this into a <code class="language-plaintext highlighter-rouge">small</code> list of coefficients, which we then as in Shamir’s scheme extend with zero coefficients to get a <code class="language-plaintext highlighter-rouge">large</code> list of coefficients suitable for running through the forward FFT. Finally, since the first value obtained from this corresponds to point 1, and hence is the same as the constant used before, we remove it before returning the values as shares.</p>
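<p>The claim that the two subgroups only share the point 1 is easy to check numerically. The following sketch does so for <code class="language-plaintext highlighter-rouge">Q == 433</code> with subgroups of order 8 and 9; the specific orders and the brute-force generator search are choices made for this illustration.</p>

```python
Q = 433  # Q - 1 == 432 == 16 * 27

# a generator of the full multiplicative group: its order is not a proper
# divisor of 432, which only needs checking for the prime factors 2 and 3
g = next(c for c in range(2, Q)
         if pow(c, 432 // 2, Q) != 1 and pow(c, 432 // 3, Q) != 1)

omega_small = pow(g, 432 // 8, Q)   # generates the subgroup of order 8
omega_large = pow(g, 432 // 9, Q)   # generates the subgroup of order 9

small_points = {pow(omega_small, i, Q) for i in range(8)}
large_points = {pow(omega_large, i, Q) for i in range(9)}

# any common element has an order dividing gcd(8, 9) == 1, so only 1 remains
common = small_points.intersection(large_points)   # common == {1}
```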
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">packed_share</span><span class="p">(</span><span class="n">secrets</span><span class="p">):</span>
    <span class="n">small_values</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="n">secrets</span> <span class="o">+</span> <span class="p">[</span><span class="n">random</span><span class="o">.</span><span class="n">randrange</span><span class="p">(</span><span class="n">Q</span><span class="p">)</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">T</span><span class="p">)]</span>
    <span class="n">small_coeffs</span> <span class="o">=</span> <span class="n">fft2_backward</span><span class="p">(</span><span class="n">small_values</span><span class="p">,</span> <span class="n">OMEGA_SMALL</span><span class="p">)</span>
    <span class="n">large_coeffs</span> <span class="o">=</span> <span class="n">small_coeffs</span> <span class="o">+</span> <span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="p">(</span><span class="n">ORDER_LARGE</span> <span class="o">-</span> <span class="n">ORDER_SMALL</span><span class="p">)</span>
    <span class="n">large_values</span> <span class="o">=</span> <span class="n">fft3_forward</span><span class="p">(</span><span class="n">large_coeffs</span><span class="p">,</span> <span class="n">OMEGA_LARGE</span><span class="p">)</span>
    <span class="n">shares</span> <span class="o">=</span> <span class="n">large_values</span><span class="p">[</span><span class="mi">1</span><span class="p">:]</span>
    <span class="k">return</span> <span class="n">shares</span>
</code></pre></div></div>
<p>Besides <code class="language-plaintext highlighter-rouge">T</code>, <code class="language-plaintext highlighter-rouge">N</code>, and the number <code class="language-plaintext highlighter-rouge">K</code> of secrets packed together, the parameters for this scheme are hence the prime <code class="language-plaintext highlighter-rouge">Q</code> and the two generators <code class="language-plaintext highlighter-rouge">OMEGA_SMALL</code> and <code class="language-plaintext highlighter-rouge">OMEGA_LARGE</code>, of order <code class="language-plaintext highlighter-rouge">ORDER_SMALL == T + K + 1</code> and <code class="language-plaintext highlighter-rouge">ORDER_LARGE == N + 1</code> respectively.</p>
<p>We will talk more about how to do efficient reconstruction in the next blog post, but note that if all the shares are known then the above sharing procedure can efficiently be run backwards by simply running the two FFTs in their opposite direction.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">packed_reconstruct</span><span class="p">(</span><span class="n">shares</span><span class="p">):</span>
    <span class="n">large_values</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="n">shares</span>
    <span class="n">large_coeffs</span> <span class="o">=</span> <span class="n">fft3_backward</span><span class="p">(</span><span class="n">large_values</span><span class="p">,</span> <span class="n">OMEGA_LARGE</span><span class="p">)</span>
    <span class="n">small_coeffs</span> <span class="o">=</span> <span class="n">large_coeffs</span><span class="p">[:</span><span class="n">ORDER_SMALL</span><span class="p">]</span>
    <span class="n">small_values</span> <span class="o">=</span> <span class="n">fft2_forward</span><span class="p">(</span><span class="n">small_coeffs</span><span class="p">,</span> <span class="n">OMEGA_SMALL</span><span class="p">)</span>
    <span class="n">secrets</span> <span class="o">=</span> <span class="n">small_values</span><span class="p">[</span><span class="mi">1</span><span class="p">:</span><span class="n">K</span><span class="o">+</span><span class="mi">1</span><span class="p">]</span>
    <span class="k">return</span> <span class="n">secrets</span>
</code></pre></div></div>
<p>However this only works if all shares are known and correct: any loss or tampering will get in the way of using the FFT for reconstruction, unless we add an additional ingredient. Fixing this is the topic of the next blog post.</p>
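<p>To see sharing and full reconstruction work end to end, here is a small self-contained round trip. To keep it short it uses a single generic radix FFT, <code class="language-plaintext highlighter-rouge">fft_forward(coeffs, omega, base)</code>, as a stand-in for the separate power-of-2 and power-of-3 functions; that helper, the toy parameters <code class="language-plaintext highlighter-rouge">K = 2</code>, <code class="language-plaintext highlighter-rouge">T = 1</code>, <code class="language-plaintext highlighter-rouge">N = 8</code>, and the generator search are all assumptions of this sketch.</p>

```python
import random

Q = 433
K, T, N = 2, 1, 8
ORDER_SMALL, ORDER_LARGE = K + T + 1, N + 1      # 4 and 9

g = next(c for c in range(2, Q) if pow(c, 216, Q) != 1 and pow(c, 144, Q) != 1)
OMEGA_SMALL = pow(g, (Q - 1) // ORDER_SMALL, Q)
OMEGA_LARGE = pow(g, (Q - 1) // ORDER_LARGE, Q)

def fft_forward(coeffs, omega, base):
    """Generic radix-`base` FFT (base 2 or 3); a stand-in for fft2/fft3."""
    if len(coeffs) == 1:
        return coeffs
    omega_pow = pow(omega, base, Q)
    parts = [fft_forward(coeffs[r::base], omega_pow, base) for r in range(base)]
    step = len(coeffs) // base
    values = []
    for j in range(len(coeffs)):
        x = pow(omega, j, Q)
        values.append(sum(pow(x, r, Q) * parts[r][j % step] for r in range(base)) % Q)
    return values

def fft_backward(values, omega, base):
    L_inv = pow(len(values), Q - 2, Q)
    return [(a * L_inv) % Q for a in fft_forward(values, pow(omega, Q - 2, Q), base)]

def packed_share(secrets):
    small_values = [0] + secrets + [random.randrange(Q) for _ in range(T)]
    small_coeffs = fft_backward(small_values, OMEGA_SMALL, 2)
    large_coeffs = small_coeffs + [0] * (ORDER_LARGE - ORDER_SMALL)
    return fft_forward(large_coeffs, OMEGA_LARGE, 3)[1:]   # drop value at point 1

def packed_reconstruct(shares):
    large_coeffs = fft_backward([0] + shares, OMEGA_LARGE, 3)
    small_values = fft_forward(large_coeffs[:ORDER_SMALL], OMEGA_SMALL, 2)
    return small_values[1:K + 1]

secrets = [42, 123]
shares = packed_share(secrets)
recovered = packed_reconstruct(shares)   # equals secrets when all shares are present
```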
<h2 id="performance-evaluation">Performance evaluation</h2>
<p>To test the performance impact of using the FFT for share generation in Shamir’s scheme, we let the number of shares <code class="language-plaintext highlighter-rouge">N</code> take on values <code class="language-plaintext highlighter-rouge">2</code>, <code class="language-plaintext highlighter-rouge">8</code>, <code class="language-plaintext highlighter-rouge">26</code>, <code class="language-plaintext highlighter-rouge">80</code> and <code class="language-plaintext highlighter-rouge">242</code>, and for each of them compare against the typical approach of using Horner’s rule. For the former we have an asymptotic complexity of <code class="language-plaintext highlighter-rouge">Oh(N * log N)</code> while for the latter we have <code class="language-plaintext highlighter-rouge">Oh(N * T)</code>, and as such it is also interesting to vary <code class="language-plaintext highlighter-rouge">T</code>; we do so with <code class="language-plaintext highlighter-rouge">T = N/2</code> and <code class="language-plaintext highlighter-rouge">T = N/4</code>, representing respectively a medium and low privacy threshold.</p>
<p>All measurements are in nanoseconds (1/1,000,000 of a millisecond) and were performed with <a href="https://github.com/mortendahl/rust-threshold-secret-sharing">our Rust implementation</a>.</p>
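<p>For reference, the Horner baseline can be pictured roughly as follows: each of the <code class="language-plaintext highlighter-rouge">N</code> shares costs <code class="language-plaintext highlighter-rouge">T</code> multiplications, which is where the <code class="language-plaintext highlighter-rouge">N * T</code> bound comes from. This Python sketch is not the benchmarked Rust code; the function names, the toy parameters, and the choice of evaluation points <code class="language-plaintext highlighter-rouge">1..N</code> are assumptions.</p>

```python
import random

Q = 433
T, N = 4, 8

def horner_evaluate(coeffs, x):
    """Evaluate a polynomial at x with len(coeffs) - 1 multiplications."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % Q
    return result

def shamir_share_horner(secret, points):
    # T random coefficients on top of the secret give a polynomial of degree T
    coeffs = [secret] + [random.randrange(Q) for _ in range(T)]
    # one Horner evaluation per share: N evaluations of T multiplications each
    return [horner_evaluate(coeffs, x) for x in points]

points = list(range(1, N + 1))
shares = shamir_share_horner(5, points)
```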
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">20</span><span class="p">,</span><span class="mi">10</span><span class="p">))</span>
<span class="n">shares</span> <span class="o">=</span> <span class="p">[</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">26</span><span class="p">,</span> <span class="mi">80</span> <span class="p">]</span> <span class="c1">#, 242 ]
</span>
<span class="n">n2_fft</span> <span class="o">=</span> <span class="p">[</span> <span class="mi">214</span><span class="p">,</span> <span class="mi">402</span><span class="p">,</span> <span class="mi">1012</span><span class="p">,</span> <span class="mi">2944</span> <span class="p">]</span> <span class="c1">#, 10525 ]
</span><span class="n">n2_horner</span> <span class="o">=</span> <span class="p">[</span> <span class="mi">51</span><span class="p">,</span> <span class="mi">289</span><span class="p">,</span> <span class="mi">2365</span><span class="p">,</span> <span class="mi">22278</span> <span class="p">]</span> <span class="c1">#, 203630 ]
</span>
<span class="n">n4_fft</span> <span class="o">=</span> <span class="p">[</span> <span class="mi">227</span><span class="p">,</span> <span class="mi">409</span><span class="p">,</span> <span class="mi">1038</span><span class="p">,</span> <span class="mi">3105</span> <span class="p">]</span> <span class="c1">#, 10470 ]
</span><span class="n">n4_horner</span> <span class="o">=</span> <span class="p">[</span> <span class="mi">54</span><span class="p">,</span> <span class="mi">180</span><span class="p">,</span> <span class="mi">1380</span><span class="p">,</span> <span class="mi">11631</span> <span class="p">]</span> <span class="c1">#, 104388 ]
</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">shares</span><span class="p">,</span> <span class="n">n2_fft</span><span class="p">,</span> <span class="s">'ro--'</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'b'</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">'T = N/2: FFT'</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">shares</span><span class="p">,</span> <span class="n">n2_horner</span><span class="p">,</span> <span class="s">'rs--'</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'r'</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">'T = N/2: Horner'</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">shares</span><span class="p">,</span> <span class="n">n4_fft</span><span class="p">,</span> <span class="s">'ro--'</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'c'</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">'T = N/4: FFT'</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">shares</span><span class="p">,</span> <span class="n">n4_horner</span><span class="p">,</span> <span class="s">'rs--'</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'y'</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">'T = N/4: Horner'</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>
<p>Note that the numbers for <code class="language-plaintext highlighter-rouge">N = 242</code> are omitted in the graph to avoid hiding the results for the smaller values.</p>
<center><img src="https://mortendahl.github.io/assets/secret-sharing/share-performance-shamir.png" /></center>
<p>For the packed scheme we keep <code class="language-plaintext highlighter-rouge">T = N/4</code> and <code class="language-plaintext highlighter-rouge">K = N/2</code> fixed (meaning <code class="language-plaintext highlighter-rouge">R = 3N/4</code>) and let <code class="language-plaintext highlighter-rouge">N</code> vary as above. We then compare three different approaches for generating shares, all starting out with sampling a polynomial in point-value representation:</p>
<ol>
<li><code class="language-plaintext highlighter-rouge">FFT + FFT</code>: Backward FFT to convert into coefficient representation, followed by forward FFT for evaluation</li>
<li><code class="language-plaintext highlighter-rouge">FFT + Horner</code>: Backward FFT to convert into coefficient representation, followed by Horner’s rule for evaluation</li>
<li><code class="language-plaintext highlighter-rouge">Lagrange</code>: Use precomputed Lagrange constants for share points to directly obtain shares</li>
</ol>
<p>where the third option requires additional storage for the precomputed constants (computing them on the fly increases the running time significantly but can of course be amortized away if processing a large number of batches).</p>
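<p>The third option can be sketched as a matrix of precomputed constants that maps the sampled values directly to shares, skipping the coefficient representation altogether. The names <code class="language-plaintext highlighter-rouge">lagrange_constants</code> and <code class="language-plaintext highlighter-rouge">lagrange_share</code>, the toy sizes (4 sample points, 8 share points), and the generator search are assumptions of this illustration, not the benchmarked implementation.</p>

```python
Q = 433

g = next(c for c in range(2, Q) if pow(c, 216, Q) != 1 and pow(c, 144, Q) != 1)
omega_small = pow(g, (Q - 1) // 4, Q)
omega_large = pow(g, (Q - 1) // 9, Q)
sample_points = [pow(omega_small, i, Q) for i in range(4)]
share_points = [pow(omega_large, i, Q) for i in range(1, 9)]   # point 1 excluded

def lagrange_constants(sample_points, share_points):
    """constants[j][i]: weight of the value at sample_points[i] in share j."""
    constants = []
    for x in share_points:
        row = []
        for i, xi in enumerate(sample_points):
            num, den = 1, 1
            for m, xm in enumerate(sample_points):
                if m != i:
                    num = (num * (x - xm)) % Q
                    den = (den * (xi - xm)) % Q
            row.append((num * pow(den, Q - 2, Q)) % Q)
        constants.append(row)
    return constants

# computed once and stored; this is the extra storage mentioned above
CONSTANTS = lagrange_constants(sample_points, share_points)

def lagrange_share(values):
    return [sum(c * v for c, v in zip(row, values)) % Q for row in CONSTANTS]

def evaluate(coeffs, x):
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % Q
    return result

# check against direct evaluation of a known polynomial
coeffs = [9, 8, 7, 6]
values = [evaluate(coeffs, x) for x in sample_points]
shares = lagrange_share(values)
expected = [evaluate(coeffs, x) for x in share_points]
```

<p>Each share then costs one inner product of length <code class="language-plaintext highlighter-rouge">len(sample_points)</code>, which is consistent with this approach doing well for small sizes and losing out as <code class="language-plaintext highlighter-rouge">N</code> grows.</p>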
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">20</span><span class="p">,</span><span class="mi">10</span><span class="p">))</span>
<span class="n">shares</span> <span class="o">=</span> <span class="p">[</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">26</span><span class="p">,</span> <span class="mi">80</span><span class="p">,</span> <span class="mi">242</span> <span class="p">]</span>
<span class="n">fft_fft</span> <span class="o">=</span> <span class="p">[</span> <span class="mi">840</span><span class="p">,</span> <span class="mi">1998</span><span class="p">,</span> <span class="mi">5288</span><span class="p">,</span> <span class="mi">15102</span> <span class="p">]</span>
<span class="n">fft_horner</span> <span class="o">=</span> <span class="p">[</span> <span class="mi">898</span><span class="p">,</span> <span class="mi">3612</span><span class="p">,</span> <span class="mi">37641</span><span class="p">,</span> <span class="mi">207087</span> <span class="p">]</span>
<span class="n">lagrange_pre</span> <span class="o">=</span> <span class="p">[</span> <span class="mi">246</span><span class="p">,</span> <span class="mi">1367</span><span class="p">,</span> <span class="mi">16510</span><span class="p">,</span> <span class="mi">102317</span> <span class="p">]</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">shares</span><span class="p">,</span> <span class="n">fft_fft</span><span class="p">,</span> <span class="s">'ro--'</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'b'</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">'FFT + FFT'</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">shares</span><span class="p">,</span> <span class="n">fft_horner</span><span class="p">,</span> <span class="s">'ro--'</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'r'</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">'FFT + Horner'</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">shares</span><span class="p">,</span> <span class="n">lagrange_pre</span><span class="p">,</span> <span class="s">'rs--'</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'y'</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">'Lagrange (precomp.)'</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>
<p>We note that the Lagrange approach remains superior up to and including the setting with 26 shares, after which the two-step FFT becomes the more interesting option.</p>
<center><img src="https://mortendahl.github.io/assets/secret-sharing/share-performance-packed.png" /></center>
<p>From this small amount of empirical data the FFT seems like the obvious choice as soon as the number of shares is sufficiently high. The question, of course, is in which applications this is the case. We will explore this further in a future blog post (or see e.g. <a href="https://eprint.iacr.org/2017/643">our paper</a>).</p>
<h1 id="parameter-generation">Parameter Generation</h1>
<p>Since there are no security implications in re-using the same fixed set of parameters (i.e. <code class="language-plaintext highlighter-rouge">Q</code>, <code class="language-plaintext highlighter-rouge">OMEGA_SMALL</code>, and <code class="language-plaintext highlighter-rouge">OMEGA_LARGE</code>) across applications, parameter generation is perhaps less important compared to for instance key generation in encryption schemes. Nonetheless, one of the benefits of secret sharing schemes is their ability to avoid big expansion factors by using parameters tailored to the use case; concretely, to pick a field of just the right size. As such we shall now fill in this final piece of the puzzle and see how a set of parameters fitting with the FFTs used in the packed scheme can be generated.</p>
<p>Our main abstraction is the <code class="language-plaintext highlighter-rouge">generate_parameters</code> function, which takes a desired minimum field size in bits, as well as the number of secrets <code class="language-plaintext highlighter-rouge">k</code> we wish to pack together, the privacy threshold <code class="language-plaintext highlighter-rouge">t</code> we want, and the number <code class="language-plaintext highlighter-rouge">n</code> of shares to generate. Accounting for the value at point 1 that we are throwing away (see earlier), to be suitable for the two FFTs we must then have that <code class="language-plaintext highlighter-rouge">k + t + 1</code> is a power of 2 and that <code class="language-plaintext highlighter-rouge">n + 1</code> is a power of 3.</p>
<p>To next make sure that our field has two subgroups with those numbers of elements, we simply need to find a field whose order is divisible by both numbers. Specifically, since we’re considering prime fields, we need to find a prime <code class="language-plaintext highlighter-rouge">q</code> such that its order <code class="language-plaintext highlighter-rouge">q-1</code> is divisible by both sizes. Finally, we also need a generator <code class="language-plaintext highlighter-rouge">g</code> of the field, which can be turned into generators <code class="language-plaintext highlighter-rouge">omega_small</code> and <code class="language-plaintext highlighter-rouge">omega_large</code> of the subgroups.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">generate_parameters</span><span class="p">(</span><span class="n">min_bitsize</span><span class="p">,</span> <span class="n">k</span><span class="p">,</span> <span class="n">t</span><span class="p">,</span> <span class="n">n</span><span class="p">):</span>
    <span class="n">order_small</span> <span class="o">=</span> <span class="n">k</span> <span class="o">+</span> <span class="n">t</span> <span class="o">+</span> <span class="mi">1</span>
    <span class="n">order_large</span> <span class="o">=</span> <span class="n">n</span> <span class="o">+</span> <span class="mi">1</span>
    <span class="n">order_divisor</span> <span class="o">=</span> <span class="n">order_small</span> <span class="o">*</span> <span class="n">order_large</span>
    <span class="n">q</span><span class="p">,</span> <span class="n">g</span> <span class="o">=</span> <span class="n">find_prime_field</span><span class="p">(</span><span class="n">min_bitsize</span><span class="p">,</span> <span class="n">order_divisor</span><span class="p">)</span>
    <span class="n">order</span> <span class="o">=</span> <span class="n">q</span> <span class="o">-</span> <span class="mi">1</span>
    <span class="n">omega_small</span> <span class="o">=</span> <span class="nb">pow</span><span class="p">(</span><span class="n">g</span><span class="p">,</span> <span class="n">order</span> <span class="o">//</span> <span class="n">order_small</span><span class="p">,</span> <span class="n">q</span><span class="p">)</span>
    <span class="n">omega_large</span> <span class="o">=</span> <span class="nb">pow</span><span class="p">(</span><span class="n">g</span><span class="p">,</span> <span class="n">order</span> <span class="o">//</span> <span class="n">order_large</span><span class="p">,</span> <span class="n">q</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">q</span><span class="p">,</span> <span class="n">omega_small</span><span class="p">,</span> <span class="n">omega_large</span>
</code></pre></div></div>
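<p>As a small sanity check of how the subgroup generators are derived, the sketch below uses the toy prime <code class="language-plaintext highlighter-rouge">433</code> (an assumption chosen for illustration, since <code class="language-plaintext highlighter-rouge">432 = 16 * 27</code>) with subgroup orders 8 and 9. The helpers <code class="language-plaintext highlighter-rouge">element_order</code> and <code class="language-plaintext highlighter-rouge">brute_force_generator</code> are hypothetical and only viable for tiny fields; they stand in for <code class="language-plaintext highlighter-rouge">find_prime_field</code>.</p>

```python
def element_order(x, q):
    # Multiplicative order of x modulo the prime q, by repeated multiplication.
    order, acc = 1, x % q
    while acc != 1:
        acc = (acc * x) % q
        order += 1
    return order

def brute_force_generator(q):
    # Hypothetical helper: first element whose order is the full q - 1.
    for candidate in range(2, q):
        if element_order(candidate, q) == q - 1:
            return candidate

q = 433          # toy prime; q - 1 = 432 = 2**4 * 3**3
order_small = 8  # e.g. k + t + 1 with k = 4 and t = 3
order_large = 9  # e.g. n + 1 with n = 8
assert (q - 1) % (order_small * order_large) == 0

g = brute_force_generator(q)
omega_small = pow(g, (q - 1) // order_small, q)
omega_large = pow(g, (q - 1) // order_large, q)

# Raising a generator of order q - 1 to (q - 1) / d yields an element of order d.
assert element_order(omega_small, q) == order_small
assert element_order(omega_large, q) == order_large
```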
<p>Finding our <code class="language-plaintext highlighter-rouge">q</code> and <code class="language-plaintext highlighter-rouge">g</code> is done by <code class="language-plaintext highlighter-rouge">find_prime_field</code>, which works by first finding a prime of the right size and with the right order. To then also find the generator we need a piece of auxiliary information, namely the prime factors in the order.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">find_prime_field</span><span class="p">(</span><span class="n">min_bitsize</span><span class="p">,</span> <span class="n">order_divisor</span><span class="p">):</span>
<span class="n">q</span><span class="p">,</span> <span class="n">order_prime_factors</span> <span class="o">=</span> <span class="n">find_prime</span><span class="p">(</span><span class="n">min_bitsize</span><span class="p">,</span> <span class="n">order_divisor</span><span class="p">)</span>
<span class="n">g</span> <span class="o">=</span> <span class="n">find_generator</span><span class="p">(</span><span class="n">q</span><span class="p">,</span> <span class="n">order_prime_factors</span><span class="p">)</span>
<span class="k">return</span> <span class="n">q</span><span class="p">,</span> <span class="n">g</span>
</code></pre></div></div>
<p>The reason for this is that we can use the prime factors of the order to efficiently test whether an arbitrary candidate element in the field is in fact a generator with that order. This follows from <a href="https://en.wikipedia.org/wiki/Lagrange%27s_theorem_(group_theory)">Lagrange’s theorem</a> as detailed in standard textbooks on the matter.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">find_generator</span><span class="p">(</span><span class="n">q</span><span class="p">,</span> <span class="n">order_prime_factors</span><span class="p">):</span>
<span class="n">order</span> <span class="o">=</span> <span class="n">q</span> <span class="o">-</span> <span class="mi">1</span>
<span class="k">for</span> <span class="n">candidate</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="n">q</span><span class="p">):</span>
<span class="k">for</span> <span class="n">factor</span> <span class="ow">in</span> <span class="n">order_prime_factors</span><span class="p">:</span>
<span class="n">exponent</span> <span class="o">=</span> <span class="n">order</span> <span class="o">//</span> <span class="n">factor</span>
<span class="k">if</span> <span class="nb">pow</span><span class="p">(</span><span class="n">candidate</span><span class="p">,</span> <span class="n">exponent</span><span class="p">,</span> <span class="n">q</span><span class="p">)</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span>
<span class="k">break</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">return</span> <span class="n">candidate</span>
</code></pre></div></div>
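<p>To spell out the argument: the order of any element divides <code class="language-plaintext highlighter-rouge">q - 1</code>, so an element that is not a generator has an order dividing <code class="language-plaintext highlighter-rouge">(q - 1) / p</code> for some prime factor <code class="language-plaintext highlighter-rouge">p</code>, and raising it to that exponent gives 1. The following sketch compares this test against a brute-force order computation over the toy prime <code class="language-plaintext highlighter-rouge">433</code>; the names <code class="language-plaintext highlighter-rouge">is_generator</code> and <code class="language-plaintext highlighter-rouge">element_order</code> are assumptions made for illustration.</p>

```python
def is_generator(candidate, q, order_prime_factors):
    # candidate generates the full group iff candidate**((q-1)/p) != 1
    # for every prime factor p of the group order q - 1.
    order = q - 1
    return all(pow(candidate, order // p, q) != 1
               for p in set(order_prime_factors))

def element_order(x, q):
    # Brute-force multiplicative order; only feasible for tiny q.
    order, acc = 1, x % q
    while acc != 1:
        acc = (acc * x) % q
        order += 1
    return order

q = 433
order_prime_factors = [2, 3]  # 432 = 2**4 * 3**3

# The fast test and the brute-force definition agree on every element.
for candidate in range(2, q):
    fast = is_generator(candidate, q, order_prime_factors)
    slow = element_order(candidate, q) == q - 1
    assert fast == slow
```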
<p>This leaves us with only a few remaining questions regarding finding prime numbers, as explained next.</p>
<h2 id="finding-primes">Finding primes</h2>
<p>To find a prime <code class="language-plaintext highlighter-rouge">q</code> with the desired structure (i.e. of a certain minimum size and whose order <code class="language-plaintext highlighter-rouge">q-1</code> has a given divisor) we may either do rejection sampling of primes until we hit one that satisfies our need, or we may construct it from smaller parts so that it by design fits with what we need. The latter appears more efficient so that is what we will do here.</p>
<p>Specifically, given <code class="language-plaintext highlighter-rouge">min_bitsize</code> and <code class="language-plaintext highlighter-rouge">order_divisor</code> we will do rejection sampling over two values <code class="language-plaintext highlighter-rouge">k1</code> and <code class="language-plaintext highlighter-rouge">k2</code> until <code class="language-plaintext highlighter-rouge">q = k1 * k2 * order_divisor + 1</code> is a <a href="https://en.wikipedia.org/wiki/Probable_prime">probable prime</a>. The <code class="language-plaintext highlighter-rouge">k1</code> is used to ensure that the minimum size is met, and <code class="language-plaintext highlighter-rouge">k2</code> is used to give us a bit of wiggle room; it can in principle be omitted, but empirical tests show that it doesn’t have to be very large to give an efficiency boost, at the expense of potentially overshooting the desired field size by a few bits. Finally, since we also need to know the prime factorization of <code class="language-plaintext highlighter-rouge">q - 1</code>, and since factoring in general is believed to be an <a href="https://en.wikipedia.org/wiki/Integer_factorization">inherently slow process</a>, we ensure by construction that <code class="language-plaintext highlighter-rouge">k1</code> is a prime so that we only have to factor <code class="language-plaintext highlighter-rouge">k2</code> and <code class="language-plaintext highlighter-rouge">order_divisor</code>, which we assume to be somewhat small and hence doable.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">find_prime</span><span class="p">(</span><span class="n">min_bitsize</span><span class="p">,</span> <span class="n">order_divisor</span><span class="p">):</span>
<span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
<span class="n">k1</span> <span class="o">=</span> <span class="n">sample_prime</span><span class="p">(</span><span class="n">min_bitsize</span><span class="p">)</span>
<span class="k">for</span> <span class="n">k2</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">128</span><span class="p">):</span>
<span class="n">q</span> <span class="o">=</span> <span class="n">k1</span> <span class="o">*</span> <span class="n">k2</span> <span class="o">*</span> <span class="n">order_divisor</span> <span class="o">+</span> <span class="mi">1</span>
<span class="k">if</span> <span class="n">is_prime</span><span class="p">(</span><span class="n">q</span><span class="p">):</span>
<span class="n">order_prime_factors</span> <span class="o">=</span> <span class="p">[</span><span class="n">k1</span><span class="p">]</span>
<span class="n">order_prime_factors</span> <span class="o">+=</span> <span class="n">prime_factor</span><span class="p">(</span><span class="n">k2</span><span class="p">)</span>
<span class="n">order_prime_factors</span> <span class="o">+=</span> <span class="n">prime_factor</span><span class="p">(</span><span class="n">order_divisor</span><span class="p">)</span>
<span class="k">return</span> <span class="n">q</span><span class="p">,</span> <span class="n">order_prime_factors</span>
</code></pre></div></div>
<p>Sampling primes is done by drawing random candidates of the right size until one passes a standard <a href="https://en.wikipedia.org/wiki/Primality_test">randomized primality test</a>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">sample_prime</span><span class="p">(</span><span class="n">bitsize</span><span class="p">):</span>
<span class="n">lower</span> <span class="o">=</span> <span class="mi">1</span> <span class="o"><<</span> <span class="p">(</span><span class="n">bitsize</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span>
<span class="n">upper</span> <span class="o">=</span> <span class="mi">1</span> <span class="o"><<</span> <span class="p">(</span><span class="n">bitsize</span><span class="p">)</span>
<span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
<span class="n">candidate</span> <span class="o">=</span> <span class="n">random</span><span class="o">.</span><span class="n">randrange</span><span class="p">(</span><span class="n">lower</span><span class="p">,</span> <span class="n">upper</span><span class="p">)</span>
<span class="k">if</span> <span class="n">is_prime</span><span class="p">(</span><span class="n">candidate</span><span class="p">):</span>
<span class="k">return</span> <span class="n">candidate</span>
</code></pre></div></div>
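<p>The <code class="language-plaintext highlighter-rouge">is_prime</code> function is not shown above. One common choice of randomized test is Miller-Rabin; the sketch below is an assumption about what such a function could look like, not necessarily the test used in the accompanying code, and the <code class="language-plaintext highlighter-rouge">rounds</code> parameter is hypothetical.</p>

```python
import random

def is_prime(n, rounds=40):
    # Miller-Rabin probabilistic primality test (illustrative sketch).
    if n < 2:
        return False
    # Quick trial division by a few small primes.
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            # a is a witness that n is composite
            return False
    # n passed every round, so it is prime with overwhelming probability
    return True
```

<p>A composite number passes a single round with probability at most 1/4, so the error probability shrinks exponentially in the number of rounds.</p>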
<p>And factoring a number is done by simply trying a fixed set of all small primes in sequence; this will of course not work if the input is too large, but that is not likely to happen in real-world applications.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">prime_factor</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
<span class="n">factors</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">prime</span> <span class="ow">in</span> <span class="n">SMALL_PRIMES</span><span class="p">:</span>
<span class="k">if</span> <span class="n">prime</span> <span class="o">></span> <span class="n">x</span><span class="p">:</span> <span class="k">break</span>
<span class="k">if</span> <span class="n">x</span> <span class="o">%</span> <span class="n">prime</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
<span class="n">factors</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">prime</span><span class="p">)</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">remove_factor</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">prime</span><span class="p">)</span>
<span class="k">assert</span><span class="p">(</span><span class="n">x</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span>
<span class="k">return</span> <span class="n">factors</span>
</code></pre></div></div>
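<p>The snippet above relies on <code class="language-plaintext highlighter-rouge">SMALL_PRIMES</code> and <code class="language-plaintext highlighter-rouge">remove_factor</code>, which are not shown. A minimal sketch of what they could look like follows; the sieve helper <code class="language-plaintext highlighter-rouge">small_primes_up_to</code> and the bound of 1000 are assumptions made for illustration, and <code class="language-plaintext highlighter-rouge">prime_factor</code> is repeated only so that the example runs on its own.</p>

```python
def small_primes_up_to(limit):
    # Sieve of Eratosthenes returning all primes <= limit.
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]

SMALL_PRIMES = small_primes_up_to(1000)  # the bound is an arbitrary assumption

def remove_factor(x, prime):
    # Divide out every power of `prime` from x.
    while x % prime == 0:
        x //= prime
    return x

def prime_factor(x):
    # Distinct prime factors of x, assuming all of them are in SMALL_PRIMES.
    factors = []
    for prime in SMALL_PRIMES:
        if prime > x: break
        if x % prime == 0:
            factors.append(prime)
            x = remove_factor(x, prime)
    assert x == 1
    return factors

# 72 = 2**3 * 3**2, e.g. order_divisor for subgroup orders 8 and 9
assert prime_factor(72) == [2, 3]
```

<p>Note that only the distinct prime factors are returned, which is all the generator test needs.</p>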
<p>Putting these pieces together we end up with an efficient procedure for generating parameters for use with FFTs: finding large fields of size e.g. 128 bits is a matter of milliseconds.</p>
<h1 id="next-steps">Next Steps</h1>
<p>While we have seen that the Fast Fourier Transform can be used to greatly speed up the sharing process, it has a serious limitation when it comes to speeding up the reconstruction process: in its current form it requires all shares to be present and untampered with. As such, for some applications we may be forced to resort to the more traditional and slower approaches of <a href="https://en.wikipedia.org/wiki/Newton_polynomial">Newton</a> or <a href="https://en.wikipedia.org/wiki/Lagrange_polynomial">Lagrange</a> interpolation.</p>
<p>In <a href="/2017/08/13/secret-sharing-part3/">the next blog post</a> we will look at a technique for also using the Fast Fourier Transform for reconstruction, using techniques from error-correcting codes to account for missing or faulty shares while getting speedups similar to those we achieved here.</p>
Morten Dahl
TL;DR: efficient secret sharing requires fast polynomial evaluation and interpolation; here we go through what it takes to use the well-known Fast Fourier Transform for this.
Secret Sharing, Part 1
2017-06-04T12:00:00+00:00 2017-06-04T12:00:00+00:00
https://mortendahl.github.io/2017/06/04/secret-sharing-part1
<p><em><strong>TL;DR:</strong> first part in a series where we look at secret sharing schemes, including the lesser known packed variant of Shamir’s scheme, and give full and efficient implementations; here we start with the textbook approaches, with follow-up posts focusing on improvements from more advanced techniques for <a href="/2017/06/24/secret-sharing-part2">sharing</a> and <a href="/2017/08/13/secret-sharing-part3">reconstruction</a>.</em></p>
<p><a href="https://en.wikipedia.org/wiki/Secret_sharing">Secret sharing</a> is an old well-known cryptographic primitive, with existing real-world applications in e.g. <a href="https://bitcoinmagazine.com/articles/threshold-signatures-new-standard-wallet-security-1425937098">Bitcoin signatures</a> and <a href="https://www.vaultproject.io/docs/internals/security.html">password management</a>. But perhaps more interestingly, secret sharing also has strong links to <a href="https://en.wikipedia.org/wiki/Secure_multi-party_computation">secure computation</a> and may for instance be used for <a href="/2017/04/17/private-deep-learning-with-mpc/">private machine learning</a>.</p>
<p>The essence of the primitive is that a <em>dealer</em> wants to split a <em>secret</em> into several <em>shares</em> given to <em>shareholders</em>, in such a way that each individual shareholder learns nothing about the secret, yet if sufficiently many re-combine their shares then the secret can be reconstructed. Intuitively, the question of <em>trust</em> changes from being about the integrity of a single individual to the non-collaboration of several parties: it becomes distributed.</p>
<p>Secret sharing schemes are also interesting from a performance point of view, as they typically rely on a bare minimum of cryptographic assumptions. In particular, by not having to make any assumptions about the hardness of certain problems such as <a href="https://en.wikipedia.org/wiki/RSA_problem">factoring integers</a>, <a href="https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange">computing discrete logarithms</a>, or <a href="https://en.wikipedia.org/wiki/Ring_Learning_with_Errors">finding short vectors</a>, secret sharing schemes can provide a computational advantage compared to other cryptographic tools such as <a href="https://en.wikipedia.org/wiki/Homomorphic_encryption">homomorphic encryption</a>.</p>
<p>In this post we’ll look at a few concrete secret sharing schemes, as well as hints on how to implement them efficiently (with a later post going into more detail). We won’t focus too much on applications but simply use private aggregation of large vectors as a running example; see e.g. our <a href="TODO">paper</a> for more use cases.</p>
<p>There is a Python notebook containing <a href="https://github.com/mortendahl/privateml/blob/master/secret-sharing/Schemes.ipynb">the code samples</a>, yet for better performance our <a href="https://crates.io/crates/threshold-secret-sharing">open source Rust library</a> is recommended.</p>
<p><em>
Parts of this blog post are derived from work done at <a href="https://snips.ai/">Snips</a> and <a href="https://medium.com/snips-ai/high-volume-secret-sharing-2e7dc5b41e9a">originally appearing in another blog post</a>. That work also included parts of the Rust implementation.
</em></p>
<h1 id="additive-sharing">Additive Sharing</h1>
<p>Let’s first assume that we have fixed a <a href="https://en.wikipedia.org/wiki/Finite_field">finite field</a> to which all secrets and shares belong, and in which all computations take place; this could for instance be <a href="https://en.wikipedia.org/wiki/Modular_arithmetic">the integers modulo a prime number</a>, i.e. <code class="language-plaintext highlighter-rouge">{ 0, 1, ..., Q-1 }</code> for a prime <code class="language-plaintext highlighter-rouge">Q</code>.</p>
<p>An easy way to split a secret <code class="language-plaintext highlighter-rouge">x</code> from this field into say three shares <code class="language-plaintext highlighter-rouge">x1</code>, <code class="language-plaintext highlighter-rouge">x2</code>, <code class="language-plaintext highlighter-rouge">x3</code>, is to simply pick <code class="language-plaintext highlighter-rouge">x1</code> and <code class="language-plaintext highlighter-rouge">x2</code> at random and let <code class="language-plaintext highlighter-rouge">x3 = x - x1 - x2</code>. As argued below, this hides the secret as long as no one knows more than two shares, yet if all three shares are known then <code class="language-plaintext highlighter-rouge">x</code> can be reconstructed by simply computing <code class="language-plaintext highlighter-rouge">x1 + x2 + x3</code>. More generally, this scheme is known as <em>additive sharing</em> and works for any number <code class="language-plaintext highlighter-rouge">N</code> of shares by picking <code class="language-plaintext highlighter-rouge">T = N - 1</code> random values.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">additive_share</span><span class="p">(</span><span class="n">secret</span><span class="p">):</span>
<span class="n">shares</span> <span class="o">=</span> <span class="p">[</span> <span class="n">random</span><span class="o">.</span><span class="n">randrange</span><span class="p">(</span><span class="n">Q</span><span class="p">)</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">N</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="p">]</span>
<span class="n">shares</span> <span class="o">+=</span> <span class="p">[</span> <span class="p">(</span><span class="n">secret</span> <span class="o">-</span> <span class="nb">sum</span><span class="p">(</span><span class="n">shares</span><span class="p">))</span> <span class="o">%</span> <span class="n">Q</span> <span class="p">]</span>
<span class="k">return</span> <span class="n">shares</span>
<span class="k">def</span> <span class="nf">additive_reconstruct</span><span class="p">(</span><span class="n">shares</span><span class="p">):</span>
<span class="k">return</span> <span class="nb">sum</span><span class="p">(</span><span class="n">shares</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
</code></pre></div></div>
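<p>The snippet above leaves <code class="language-plaintext highlighter-rouge">Q</code> and <code class="language-plaintext highlighter-rouge">N</code> implicit. A self-contained variant with toy values (both are assumptions chosen for illustration) could look as follows; it also swaps in <code class="language-plaintext highlighter-rouge">secrets.randbelow</code> from the standard library, since Python’s <code class="language-plaintext highlighter-rouge">random</code> module is not intended for cryptographic use.</p>

```python
import secrets

Q = 433  # toy prime modulus, chosen only for illustration
N = 3    # number of shares

def additive_share(secret):
    # N - 1 uniformly random field elements drawn from the OS CSPRNG
    shares = [secrets.randbelow(Q) for _ in range(N - 1)]
    # the last share is fixed so that all shares sum to the secret mod Q
    shares += [(secret - sum(shares)) % Q]
    return shares

def additive_reconstruct(shares):
    return sum(shares) % Q

shares = additive_share(5)
assert len(shares) == N
assert additive_reconstruct(shares) == 5
```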
<p>That the secret remains hidden as long as at most <code class="language-plaintext highlighter-rouge">T = N - 1</code> shareholders collaborate follows from the marginal distribution of the view of up to <code class="language-plaintext highlighter-rouge">T</code> shareholders being independent of the secret. More intuitively, given at most <code class="language-plaintext highlighter-rouge">T</code> shares, <em>any</em> guess one may make at what the secret could be, can be explained by the remaining unseen share, and is hence an equally valid guess.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">explain</span><span class="p">(</span><span class="n">seen_shares</span><span class="p">,</span> <span class="n">guess</span><span class="p">):</span>
<span class="c1"># compute the unseen share that justifies the seen shares and the guess
</span> <span class="n">simulated_unseen_share</span> <span class="o">=</span> <span class="p">(</span><span class="n">guess</span> <span class="o">-</span> <span class="nb">sum</span><span class="p">(</span><span class="n">seen_shares</span><span class="p">))</span> <span class="o">%</span> <span class="n">Q</span>
<span class="c1"># and the would-be sharing by combining seen and unseen shares
</span> <span class="n">simulated_shares</span> <span class="o">=</span> <span class="n">seen_shares</span> <span class="o">+</span> <span class="p">[</span><span class="n">simulated_unseen_share</span><span class="p">]</span>
<span class="k">if</span> <span class="n">additive_reconstruct</span><span class="p">(</span><span class="n">simulated_shares</span><span class="p">)</span> <span class="o">==</span> <span class="n">guess</span><span class="p">:</span>
<span class="c1"># found an explanation
</span> <span class="k">return</span> <span class="n">simulated_unseen_share</span>
<span class="n">seen_shares</span> <span class="o">=</span> <span class="n">shares</span><span class="p">[:</span><span class="n">N</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span>
<span class="k">for</span> <span class="n">guess</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">Q</span><span class="p">):</span>
<span class="n">explanation</span> <span class="o">=</span> <span class="n">explain</span><span class="p">(</span><span class="n">seen_shares</span><span class="p">,</span> <span class="n">guess</span><span class="p">)</span>
<span class="k">if</span> <span class="n">explanation</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
<span class="k">print</span><span class="p">(</span><span class="s">"guess </span><span class="si">%</span><span class="s">d can be explained by </span><span class="si">%</span><span class="s">d"</span> <span class="o">%</span> <span class="p">(</span><span class="n">guess</span><span class="p">,</span> <span class="n">explanation</span><span class="p">))</span>
</code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>guess 0 can be explained by 28
guess 1 can be explained by 29
guess 2 can be explained by 30
guess 3 can be explained by 31
guess 4 can be explained by 32
guess 5 can be explained by 33
...
</code></pre></div></div>
<p>And since all we need for this argument to go through is the ability to sample random field elements, with no additional constraints on the size of the field due to e.g. hardness assumptions, this scheme is highly efficient both in terms of time and space.</p>
<h2 id="homomorphic-addition">Homomorphic addition</h2>
<p>While it is also about as simple as it gets, notice that the scheme already has a homomorphic property that allows for certain degrees of secure computation: we can add secrets together, so if e.g. <code class="language-plaintext highlighter-rouge">x1</code>, <code class="language-plaintext highlighter-rouge">x2</code>, <code class="language-plaintext highlighter-rouge">x3</code> is a sharing of <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y1</code>, <code class="language-plaintext highlighter-rouge">y2</code>, <code class="language-plaintext highlighter-rouge">y3</code> is a sharing of <code class="language-plaintext highlighter-rouge">y</code>, then <code class="language-plaintext highlighter-rouge">x1+y1</code>, <code class="language-plaintext highlighter-rouge">x2+y2</code>, <code class="language-plaintext highlighter-rouge">x3+y3</code> is a sharing of <code class="language-plaintext highlighter-rouge">x + y</code>, which can be computed individually by the three shareholders simply by adding the shares they already have (respectively <code class="language-plaintext highlighter-rouge">x1</code> and <code class="language-plaintext highlighter-rouge">y1</code>, <code class="language-plaintext highlighter-rouge">x2</code> and <code class="language-plaintext highlighter-rouge">y2</code>, and <code class="language-plaintext highlighter-rouge">x3</code> and <code class="language-plaintext highlighter-rouge">y3</code>). Then, once added, these new shares can be used to reconstruct the result of the addition but not the addends.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">additive_add</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
<span class="k">return</span> <span class="p">[</span> <span class="p">(</span><span class="n">xi</span> <span class="o">+</span> <span class="n">yi</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span> <span class="k">for</span> <span class="n">xi</span><span class="p">,</span> <span class="n">yi</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="p">]</span>
</code></pre></div></div>
<p>More generally, we can ask the shareholders to compute linear functions of secret inputs without them seeing anything but the shares, and without learning anything besides the final output of the function.</p>
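<p>As an illustration of such a linear function, the sketch below computes <code class="language-plaintext highlighter-rouge">2*x + 3*y</code> on shares only. The helper <code class="language-plaintext highlighter-rouge">additive_scale</code> (multiplication by a public constant) and the toy values of <code class="language-plaintext highlighter-rouge">Q</code> and <code class="language-plaintext highlighter-rouge">N</code> are assumptions introduced for this example.</p>

```python
import random

Q = 433  # toy prime modulus (assumption)
N = 3    # number of shareholders (assumption)

def additive_share(secret):
    shares = [random.randrange(Q) for _ in range(N - 1)]
    shares += [(secret - sum(shares)) % Q]
    return shares

def additive_reconstruct(shares):
    return sum(shares) % Q

def additive_add(x, y):
    return [(xi + yi) % Q for xi, yi in zip(x, y)]

def additive_scale(x, c):
    # hypothetical helper: multiply a sharing by a public constant c
    return [(xi * c) % Q for xi in x]

x = additive_share(10)
y = additive_share(20)

# each shareholder i only combines its own x[i] and y[i]
z = additive_add(additive_scale(x, 2), additive_scale(y, 3))

assert additive_reconstruct(z) == (2 * 10 + 3 * 20) % Q
```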
<h1 id="comparing-schemes">Comparing schemes</h1>
<p>While the above scheme is particularly simple, below are two examples of slightly more advanced schemes. One way to compare these is through the following four parameters:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">N</code>: the number of shares that each secret is split into</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">R</code>: the minimum number of shares needed to reconstruct the secret</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">T</code>: the maximum number of shares that may be seen without learning anything about the secret, also known as the <em>privacy threshold</em></p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">K</code>: the number of secrets shared together</p>
</li>
</ul>
<p>where, logically, we must have <code class="language-plaintext highlighter-rouge">R <= N</code> since otherwise reconstruction is never possible, and we must have <code class="language-plaintext highlighter-rouge">T < R</code> since otherwise privacy makes little sense.</p>
<p>For the additive scheme we have <code class="language-plaintext highlighter-rouge">R = N</code>, <code class="language-plaintext highlighter-rouge">K = 1</code>, and <code class="language-plaintext highlighter-rouge">T = R - K</code>, but below we will get rid of the first two of these constraints so that in the end we are free to choose the parameters any way we like as long as <code class="language-plaintext highlighter-rouge">T + K = R <= N</code>.</p>
<h1 id="shamirs-scheme">Shamir’s Scheme</h1>
<p>The additive scheme lacks some robustness due to the constraint that <code class="language-plaintext highlighter-rouge">R = N</code>, meaning that if one of the shareholders for some reason becomes unavailable or loses their share then reconstruction is no longer possible. By moving to a different scheme we can remove this constraint and let <code class="language-plaintext highlighter-rouge">R</code> (and hence also <code class="language-plaintext highlighter-rouge">T</code>) be chosen freely for any particular application.</p>
<p>In <a href="https://en.wikipedia.org/wiki/Shamir%27s_Secret_Sharing">Shamir’s scheme</a>, instead of picking random field elements that sum up to the secret <code class="language-plaintext highlighter-rouge">x</code> as we did above, to share <code class="language-plaintext highlighter-rouge">x</code> we sample a random polynomial <code class="language-plaintext highlighter-rouge">f</code> with the condition that <code class="language-plaintext highlighter-rouge">f(0) = x</code> and evaluate this polynomial at <code class="language-plaintext highlighter-rouge">N</code> non-zero points to obtain the shares as <code class="language-plaintext highlighter-rouge">f(1)</code>, <code class="language-plaintext highlighter-rouge">f(2)</code>, …, <code class="language-plaintext highlighter-rouge">f(N)</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">shamir_share</span><span class="p">(</span><span class="n">secret</span><span class="p">):</span>
<span class="n">polynomial</span> <span class="o">=</span> <span class="n">sample_shamir_polynomial</span><span class="p">(</span><span class="n">secret</span><span class="p">)</span>
<span class="n">shares</span> <span class="o">=</span> <span class="p">[</span> <span class="n">evaluate_at_point</span><span class="p">(</span><span class="n">polynomial</span><span class="p">,</span> <span class="n">p</span><span class="p">)</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">SHARE_POINTS</span> <span class="p">]</span>
<span class="k">return</span> <span class="n">shares</span>
</code></pre></div></div>
<p>And by varying the degree of <code class="language-plaintext highlighter-rouge">f</code> we can choose how many shares are needed before reconstruction is possible, thereby removing the <code class="language-plaintext highlighter-rouge">R = N</code> constraint. More specifically, if the degree of <code class="language-plaintext highlighter-rouge">f</code> is <code class="language-plaintext highlighter-rouge">T</code> then we know from <a href="https://en.wikipedia.org/wiki/Lagrange_polynomial">interpolation</a> that it is uniquely identified by either its <code class="language-plaintext highlighter-rouge">T+1</code> coefficients or by its value at <code class="language-plaintext highlighter-rouge">T+1</code> points, so that <code class="language-plaintext highlighter-rouge">R = T+1</code> shares allow us to reliably reconstruct.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">shamir_reconstruct</span><span class="p">(</span><span class="n">shares</span><span class="p">):</span>
<span class="n">polynomial</span> <span class="o">=</span> <span class="p">[</span> <span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">v</span><span class="p">)</span> <span class="k">for</span> <span class="n">p</span><span class="p">,</span> <span class="n">v</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">SHARE_POINTS</span><span class="p">,</span> <span class="n">shares</span><span class="p">)</span> <span class="k">if</span> <span class="n">v</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span> <span class="p">]</span>
<span class="n">secret</span> <span class="o">=</span> <span class="n">interpolate_at_point</span><span class="p">(</span><span class="n">polynomial</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
<span class="k">return</span> <span class="n">secret</span>
</code></pre></div></div>
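<p>The two snippets above depend on helpers that are not shown at this point. The following is a minimal textbook sketch of what they could look like, using Horner’s rule for evaluation and Lagrange interpolation for reconstruction; the bodies of <code class="language-plaintext highlighter-rouge">sample_shamir_polynomial</code>, <code class="language-plaintext highlighter-rouge">evaluate_at_point</code> and <code class="language-plaintext highlighter-rouge">interpolate_at_point</code>, as well as the toy parameters, are assumptions made for illustration.</p>

```python
import random

Q = 433  # toy prime modulus (assumption)
N = 5    # number of shares
T = 2    # privacy threshold, so R = T + 1 = 3 shares reconstruct
SHARE_POINTS = list(range(1, N + 1))

def sample_shamir_polynomial(secret):
    # coefficients, lowest degree first; the constant term is the secret
    return [secret] + [random.randrange(Q) for _ in range(T)]

def evaluate_at_point(coefs, point):
    # Horner's rule
    result = 0
    for coef in reversed(coefs):
        result = (result * point + coef) % Q
    return result

def interpolate_at_point(points_values, point):
    # Lagrange interpolation, evaluated directly at `point`
    result = 0
    for i, (xi, yi) in enumerate(points_values):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points_values):
            if i != j:
                num = (num * (point - xj)) % Q
                den = (den * (xi - xj)) % Q
        # modular inverse of den via Fermat's little theorem
        result = (result + yi * num * pow(den, Q - 2, Q)) % Q
    return result

def shamir_share(secret):
    polynomial = sample_shamir_polynomial(secret)
    return [evaluate_at_point(polynomial, p) for p in SHARE_POINTS]

def shamir_reconstruct(shares):
    polynomial = [(p, v) for p, v in zip(SHARE_POINTS, shares) if v is not None]
    return interpolate_at_point(polynomial, 0)

shares = shamir_share(42)
shares[0] = None  # two shares go missing,
shares[3] = None  # leaving exactly R = 3
assert shamir_reconstruct(shares) == 42
```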
<p>And at the same time, given at most <code class="language-plaintext highlighter-rouge">T</code> shares, the secret is again guaranteed to be hidden since we also here can find an explanation for any guess: a guess is the value of <code class="language-plaintext highlighter-rouge">f</code> at point zero, so together with the <code class="language-plaintext highlighter-rouge">T</code> known shares, interpolation allows us to find a polynomial with the right degree that matches all values.</p>
<p>Before discussing how these operations can be done efficiently, let’s first see the properties this scheme has in terms of secure computation.</p>
<h2 id="homomorphic-addition-and-multiplication">Homomorphic addition and multiplication</h2>
<p>Since it holds for polynomials in general that <code class="language-plaintext highlighter-rouge">f(i) + g(i) = (f + g)(i)</code>, we also here have an additive homomorphic property that allows us to compute linear functions of secrets by simply adding the individual shares.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">shamir_add</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
    <span class="k">return</span> <span class="p">[</span> <span class="p">(</span><span class="n">xi</span> <span class="o">+</span> <span class="n">yi</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span> <span class="k">for</span> <span class="n">xi</span><span class="p">,</span> <span class="n">yi</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="p">]</span>
</code></pre></div></div>
<p>And because it also holds that <code class="language-plaintext highlighter-rouge">f(i) * g(i) = (f * g)(i)</code>, we in fact now have an additional multiplicative property that allows us to compute products in the same fashion.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">shamir_mul</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
    <span class="k">return</span> <span class="p">[</span> <span class="p">(</span><span class="n">xi</span> <span class="o">*</span> <span class="n">yi</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span> <span class="k">for</span> <span class="n">xi</span><span class="p">,</span> <span class="n">yi</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="p">]</span>
</code></pre></div></div>
<p>But while this is in principle enough to perform <em>any</em> computation without seeing the inputs (since addition and multiplication can be used to express any <a href="https://en.wikipedia.org/wiki/Boolean_circuit">boolean circuit</a>), it also comes with a caveat: unlike addition, every multiplication doubles the degree of the polynomial, so we need <code class="language-plaintext highlighter-rouge">2T+1</code> shares to reconstruct a product instead of <code class="language-plaintext highlighter-rouge">T+1</code>.</p>
<p>As a result, when used in secure computation, additional steps must be taken to reduce the degree after even a small number of multiplications, which typically involve some level of interaction between the shareholders. In this light, when compared to homomorphic encryption, secret sharing in some respect replaces heavy computation with interaction.</p>
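<p>A small worked example may make the degree doubling concrete. The parameters (<code class="language-plaintext highlighter-rouge">Q = 433</code>, <code class="language-plaintext highlighter-rouge">T = 2</code>, <code class="language-plaintext highlighter-rouge">N = 2T+1 = 5</code>), the <code class="language-plaintext highlighter-rouge">reconstruct</code> helper, and the fixed polynomial coefficients are assumptions chosen here so that the run is reproducible; a real sharing would of course pick the coefficients at random.</p>

```python
Q = 433               # small prime modulus, illustration only (assumption)
T = 2                 # privacy threshold (assumption)
N = 2 * T + 1         # enough shares to also reconstruct a product
SHARE_POINTS = list(range(1, N + 1))

def inverse(a):
    # modular inverse via Fermat's little theorem; valid because Q is prime
    return pow(a, Q - 2, Q)

def evaluate_at_point(coefs, point):
    result = 0
    for coef in reversed(coefs):
        result = (coef + point * result) % Q
    return result

def interpolate_at_point(points_values, point):
    total = 0
    for i, (xi, vi) in enumerate(points_values):
        num, denum = 1, 1
        for j, (xj, _) in enumerate(points_values):
            if j != i:
                num = (num * (xj - point)) % Q
                denum = (denum * (xj - xi)) % Q
        total = (total + vi * num * inverse(denum)) % Q
    return total

def reconstruct(shares, count):
    # hypothetical helper: reconstruct from the first `count` shares only
    return interpolate_at_point(list(zip(SHARE_POINTS, shares))[:count], 0)

# fixed (not random) degree-T polynomials with f(0) = 3 and g(0) = 5
x = [evaluate_at_point([3, 1, 1], p) for p in SHARE_POINTS]
y = [evaluate_at_point([5, 1, 1], p) for p in SHARE_POINTS]

z_add = [(xi + yi) % Q for xi, yi in zip(x, y)]   # shares of f + g, degree T
z_mul = [(xi * yi) % Q for xi, yi in zip(x, y)]   # shares of f * g, degree 2T

sum_from_few = reconstruct(z_add, T + 1)          # T+1 shares suffice for sums
product_from_few = reconstruct(z_mul, T + 1)      # too few shares for degree 2T
product_from_all = reconstruct(z_mul, 2 * T + 1)  # 2T+1 shares recover 3 * 5
```

<p>With these values the sum is recovered from <code class="language-plaintext highlighter-rouge">T+1</code> shares, whereas the product only comes out right once all <code class="language-plaintext highlighter-rouge">2T+1</code> shares are used.</p>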
<h2 id="the-missing-pieces">The missing pieces</h2>
<p>Above we ignored the questions of how to efficiently sample, evaluate, and interpolate polynomials. The first one is easy. We want a random polynomial of degree <code class="language-plaintext highlighter-rouge">T</code> with the constraint that <code class="language-plaintext highlighter-rouge">f(0) = x</code>, and we may obtain that by simply letting the zero-degree coefficient be <code class="language-plaintext highlighter-rouge">x</code> and picking the remaining <code class="language-plaintext highlighter-rouge">T</code> coefficients at random: <code class="language-plaintext highlighter-rouge">f(X) = (x) + (r1 * X^1) + (r2 * X^2) + ... + (rT * X^T)</code> where <code class="language-plaintext highlighter-rouge">x</code> is the secret, <code class="language-plaintext highlighter-rouge">X</code> the indeterminate, and <code class="language-plaintext highlighter-rouge">r1</code>, …, <code class="language-plaintext highlighter-rouge">rT</code> the random coefficients.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">sample_shamir_polynomial</span><span class="p">(</span><span class="n">zero_value</span><span class="p">):</span>
    <span class="n">coefs</span> <span class="o">=</span> <span class="p">[</span><span class="n">zero_value</span><span class="p">]</span> <span class="o">+</span> <span class="p">[</span> <span class="n">random</span><span class="o">.</span><span class="n">randrange</span><span class="p">(</span><span class="n">Q</span><span class="p">)</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">T</span><span class="p">)</span> <span class="p">]</span>
    <span class="k">return</span> <span class="n">coefs</span>
</code></pre></div></div>
<p>This gives us the polynomial in coefficient representation, which means we can perform the second task of evaluating the polynomial at <code class="language-plaintext highlighter-rouge">N</code> points somewhat efficiently using e.g. <a href="https://en.wikipedia.org/wiki/Horner%27s_method">Horner’s rule</a>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">evaluate_at_point</span><span class="p">(</span><span class="n">coefs</span><span class="p">,</span> <span class="n">point</span><span class="p">):</span>
    <span class="n">result</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">for</span> <span class="n">coef</span> <span class="ow">in</span> <span class="nb">reversed</span><span class="p">(</span><span class="n">coefs</span><span class="p">):</span>
        <span class="n">result</span> <span class="o">=</span> <span class="p">(</span><span class="n">coef</span> <span class="o">+</span> <span class="n">point</span> <span class="o">*</span> <span class="n">result</span><span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
    <span class="k">return</span> <span class="n">result</span>
</code></pre></div></div>
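<p>As a quick sanity check, Horner’s rule can be compared against evaluating each power directly, and the two pieces can be combined into a sharing function. In this sketch the values of <code class="language-plaintext highlighter-rouge">Q</code>, <code class="language-plaintext highlighter-rouge">T</code>, and <code class="language-plaintext highlighter-rouge">N</code> are assumptions, <code class="language-plaintext highlighter-rouge">evaluate_naive</code> is a hypothetical helper introduced only for the comparison, and <code class="language-plaintext highlighter-rouge">shamir_share</code> is written the way the sharing step is described in the text.</p>

```python
import random

Q = 433   # small prime modulus, illustration only (assumption)
T = 2     # privacy threshold (assumption)
N = 5     # number of shareholders (assumption)
SHARE_POINTS = list(range(1, N + 1))

def sample_shamir_polynomial(zero_value):
    # constant term is the secret, the remaining T coefficients are random
    return [zero_value] + [random.randrange(Q) for _ in range(T)]

def evaluate_at_point(coefs, point):
    # Horner's rule: one multiplication and one addition per coefficient
    result = 0
    for coef in reversed(coefs):
        result = (coef + point * result) % Q
    return result

def evaluate_naive(coefs, point):
    # direct evaluation of sum(c_k * point^k), recomputing every power
    return sum(c * pow(point, k, Q) for k, c in enumerate(coefs)) % Q

def shamir_share(secret):
    # N evaluations of T+1 coefficients each, i.e. roughly N * T steps
    coefs = sample_shamir_polynomial(secret)
    return [evaluate_at_point(coefs, p) for p in SHARE_POINTS]

# 42 + 7*2 + 3*2^2 = 68
horner = evaluate_at_point([42, 7, 3], 2)
naive = evaluate_naive([42, 7, 3], 2)

shares = shamir_share(42)
```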
<p>The interpolation step needed in reconstruction is slightly trickier. Here the polynomial is instead given in a point-value representation consisting of <code class="language-plaintext highlighter-rouge">T+1</code> pairs <code class="language-plaintext highlighter-rouge">(pi, vi)</code> that is less obviously suitable for computing <code class="language-plaintext highlighter-rouge">f(0)</code>.</p>
<p>However, using <a href="https://en.wikipedia.org/wiki/Lagrange_polynomial">Lagrange interpolation</a> we can express the value of a polynomial at any point as a weighted sum of its values at <code class="language-plaintext highlighter-rouge">T+1</code> other points, where the weights are constants determined by the points alone.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">interpolate_at_point</span><span class="p">(</span><span class="n">points_values</span><span class="p">,</span> <span class="n">point</span><span class="p">):</span>
    <span class="n">points</span><span class="p">,</span> <span class="n">values</span> <span class="o">=</span> <span class="nb">zip</span><span class="p">(</span><span class="o">*</span><span class="n">points_values</span><span class="p">)</span>
    <span class="n">constants</span> <span class="o">=</span> <span class="n">lagrange_constants_for_point</span><span class="p">(</span><span class="n">points</span><span class="p">,</span> <span class="n">point</span><span class="p">)</span>
    <span class="k">return</span> <span class="nb">sum</span><span class="p">(</span> <span class="n">ci</span> <span class="o">*</span> <span class="n">vi</span> <span class="k">for</span> <span class="n">ci</span><span class="p">,</span> <span class="n">vi</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">constants</span><span class="p">,</span> <span class="n">values</span><span class="p">)</span> <span class="p">)</span> <span class="o">%</span> <span class="n">Q</span>
</code></pre></div></div>
<p>Moreover, since these <em>Lagrange constants</em> depend only on the points and not on the values, their computation can be amortized away in case we have to perform several interpolations, as in our running example with large vectors of secrets.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">lagrange_constants_for_point</span><span class="p">(</span><span class="n">points</span><span class="p">,</span> <span class="n">point</span><span class="p">):</span>
    <span class="n">constants</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">*</span> <span class="nb">len</span><span class="p">(</span><span class="n">points</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">points</span><span class="p">)):</span>
        <span class="n">xi</span> <span class="o">=</span> <span class="n">points</span><span class="p">[</span><span class="n">i</span><span class="p">]</span>
        <span class="n">num</span> <span class="o">=</span> <span class="mi">1</span>
        <span class="n">denum</span> <span class="o">=</span> <span class="mi">1</span>
        <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">points</span><span class="p">)):</span>
            <span class="k">if</span> <span class="n">j</span> <span class="o">!=</span> <span class="n">i</span><span class="p">:</span>
                <span class="n">xj</span> <span class="o">=</span> <span class="n">points</span><span class="p">[</span><span class="n">j</span><span class="p">]</span>
                <span class="n">num</span> <span class="o">=</span> <span class="p">(</span><span class="n">num</span> <span class="o">*</span> <span class="p">(</span><span class="n">xj</span> <span class="o">-</span> <span class="n">point</span><span class="p">))</span> <span class="o">%</span> <span class="n">Q</span>
                <span class="n">denum</span> <span class="o">=</span> <span class="p">(</span><span class="n">denum</span> <span class="o">*</span> <span class="p">(</span><span class="n">xj</span> <span class="o">-</span> <span class="n">xi</span><span class="p">))</span> <span class="o">%</span> <span class="n">Q</span>
        <span class="n">constants</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="n">num</span> <span class="o">*</span> <span class="n">inverse</span><span class="p">(</span><span class="n">denum</span><span class="p">))</span> <span class="o">%</span> <span class="n">Q</span>
    <span class="k">return</span> <span class="n">constants</span>
</code></pre></div></div>
<p>Looking back at the sharing and reconstruction operations, we then see that the former takes <code class="language-plaintext highlighter-rouge">Oh(N * T)</code> steps (for each secret) and the latter <code class="language-plaintext highlighter-rouge">Oh(T)</code> steps (for each secret) if precomputation is allowed.</p>
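<p>To illustrate the precomputation, the constants for point zero can be computed once and then reused for every secret, turning each reconstruction into a single weighted sum. The sketch assumes a prime <code class="language-plaintext highlighter-rouge">Q = 433</code>, share points <code class="language-plaintext highlighter-rouge">1, 2, 3</code>, and an <code class="language-plaintext highlighter-rouge">inverse</code> based on Fermat’s little theorem; <code class="language-plaintext highlighter-rouge">reconstruct_with_constants</code> is a hypothetical helper name.</p>

```python
import random

Q = 433              # small prime modulus, illustration only (assumption)
T = 2                # privacy threshold (assumption)
POINTS = [1, 2, 3]   # the T+1 share points used for reconstruction

def inverse(a):
    # modular inverse via Fermat's little theorem; valid because Q is prime
    return pow(a, Q - 2, Q)

def lagrange_constants_for_point(points, point):
    constants = [0] * len(points)
    for i in range(len(points)):
        xi = points[i]
        num = 1
        denum = 1
        for j in range(len(points)):
            if j != i:
                xj = points[j]
                num = (num * (xj - point)) % Q
                denum = (denum * (xj - xi)) % Q
        constants[i] = (num * inverse(denum)) % Q
    return constants

def evaluate_at_point(coefs, point):
    result = 0
    for coef in reversed(coefs):
        result = (coef + point * result) % Q
    return result

# computed once, independently of any shared values
CONSTANTS = lagrange_constants_for_point(POINTS, 0)

def reconstruct_with_constants(values):
    # T+1 multiplications and additions per secret
    return sum(c * v for c, v in zip(CONSTANTS, values)) % Q

secrets = [10, 20, 30, 40]
recovered = []
for s in secrets:
    coefs = [s] + [random.randrange(Q) for _ in range(T)]
    values = [evaluate_at_point(coefs, p) for p in POINTS]
    recovered.append(reconstruct_with_constants(values))
```

<p>For these three points the constants work out to <code class="language-plaintext highlighter-rouge">3</code>, <code class="language-plaintext highlighter-rouge">-3</code>, and <code class="language-plaintext highlighter-rouge">1</code> modulo <code class="language-plaintext highlighter-rouge">Q</code>, matching what the Lagrange formula gives by hand.</p>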
<h1 id="packed-variant">Packed Variant</h1>
<p>While Shamir’s scheme gets rid of the <code class="language-plaintext highlighter-rouge">R = N</code> constraint and gives us flexibility in choosing <code class="language-plaintext highlighter-rouge">T</code> or <code class="language-plaintext highlighter-rouge">R</code>, it still has the limitation that <code class="language-plaintext highlighter-rouge">K = 1</code>. This means that each shareholder receives one share per secret, so a large number of secrets means a large number of shares for each shareholder. By using a generalised variant of Shamir’s scheme known as packed or ramp sharing, we can remove this limitation and reduce the load on each individual shareholder.</p>
<p>To share a vector of <code class="language-plaintext highlighter-rouge">K</code> secrets <code class="language-plaintext highlighter-rouge">x = [x1, x2, ..., xK]</code>, the shares are still computed as <code class="language-plaintext highlighter-rouge">f(1)</code>, <code class="language-plaintext highlighter-rouge">f(2)</code>, …, <code class="language-plaintext highlighter-rouge">f(N)</code> but the random polynomial is now sampled such that it satisfies <code class="language-plaintext highlighter-rouge">f(-1) = x1</code>, <code class="language-plaintext highlighter-rouge">f(-2) = x2</code>, …, <code class="language-plaintext highlighter-rouge">f(-K) = xK</code>.</p>
<p>Since it’s less obvious how to sample such a polynomial in coefficient representation as we did before, to achieve the desired privacy threshold we instead add <code class="language-plaintext highlighter-rouge">T</code> additional constraints <code class="language-plaintext highlighter-rouge">f(-K-1) = r1</code>, …, <code class="language-plaintext highlighter-rouge">f(-K-T) = rT</code> and simply use a point-value representation of the degree <code class="language-plaintext highlighter-rouge">T+K-1</code> polynomial.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">sample_packed_polynomial</span><span class="p">(</span><span class="n">secrets</span><span class="p">):</span>
    <span class="n">points</span> <span class="o">=</span> <span class="n">SECRET_POINTS</span> <span class="o">+</span> <span class="n">RANDOMNESS_POINTS</span>
    <span class="n">values</span> <span class="o">=</span> <span class="n">secrets</span> <span class="o">+</span> <span class="p">[</span> <span class="n">random</span><span class="o">.</span><span class="n">randrange</span><span class="p">(</span><span class="n">Q</span><span class="p">)</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">T</span><span class="p">)</span> <span class="p">]</span>
    <span class="k">return</span> <span class="nb">list</span><span class="p">(</span><span class="nb">zip</span><span class="p">(</span><span class="n">points</span><span class="p">,</span> <span class="n">values</span><span class="p">))</span>
</code></pre></div></div>
<p>This however means that we now have to perform interpolation instead of evaluation during sharing, which has an impact on efficiency, even when using precomputation as it now means storing <code class="language-plaintext highlighter-rouge">N</code> different sets of constants.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">packed_share</span><span class="p">(</span><span class="n">secrets</span><span class="p">):</span>
    <span class="n">polynomial</span> <span class="o">=</span> <span class="n">sample_packed_polynomial</span><span class="p">(</span><span class="n">secrets</span><span class="p">)</span>
    <span class="n">shares</span> <span class="o">=</span> <span class="p">[</span> <span class="n">interpolate_at_point</span><span class="p">(</span><span class="n">polynomial</span><span class="p">,</span> <span class="n">p</span><span class="p">)</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">SHARE_POINTS</span> <span class="p">]</span>
    <span class="k">return</span> <span class="n">shares</span>
</code></pre></div></div>
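<p>One way to picture the precomputation for packed sharing is as a matrix with one row of Lagrange constants per share point, so that sharing becomes a matrix-vector product. The following is a sketch under assumed parameters (<code class="language-plaintext highlighter-rouge">Q = 433</code>, <code class="language-plaintext highlighter-rouge">K = 3</code>, <code class="language-plaintext highlighter-rouge">T = 2</code>, <code class="language-plaintext highlighter-rouge">N = 8</code>); the name <code class="language-plaintext highlighter-rouge">SHARING_MATRIX</code> and the reduction of negative points modulo <code class="language-plaintext highlighter-rouge">Q</code> are choices made here for illustration.</p>

```python
import random

Q = 433              # small prime modulus, illustration only (assumption)
K, T, N = 3, 2, 8    # secrets per sharing, privacy threshold, shareholders

SECRET_POINTS = [(-i) % Q for i in range(1, K + 1)]          # -1, ..., -K
RANDOMNESS_POINTS = [(-K - i) % Q for i in range(1, T + 1)]  # -K-1, ..., -K-T
SHARE_POINTS = list(range(1, N + 1))                         # 1, ..., N

def inverse(a):
    # modular inverse via Fermat's little theorem; valid because Q is prime
    return pow(a, Q - 2, Q)

def lagrange_constants_for_point(points, point):
    constants = [0] * len(points)
    for i in range(len(points)):
        xi = points[i]
        num = 1
        denum = 1
        for j in range(len(points)):
            if j != i:
                xj = points[j]
                num = (num * (xj - point)) % Q
                denum = (denum * (xj - xi)) % Q
        constants[i] = (num * inverse(denum)) % Q
    return constants

# N rows of K+T constants each, computed once and reused for every sharing
SHARING_MATRIX = [
    lagrange_constants_for_point(SECRET_POINTS + RANDOMNESS_POINTS, p)
    for p in SHARE_POINTS
]

def packed_share(secrets):
    # each share is a weighted sum of the K secrets and T random values
    values = secrets + [random.randrange(Q) for _ in range(T)]
    return [sum(c * v for c, v in zip(row, values)) % Q for row in SHARING_MATRIX]

def interpolate_at_point(points_values, point):
    points, values = zip(*points_values)
    constants = lagrange_constants_for_point(points, point)
    return sum(c * v for c, v in zip(constants, values)) % Q

secrets = [1, 2, 3]
shares = packed_share(secrets)
recovered = [
    interpolate_at_point(list(zip(SHARE_POINTS, shares)), p)
    for p in SECRET_POINTS
]
```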
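<p>As we will see in the next blog post it is in fact also possible to sample a packed polynomial in the coefficient representation and regain efficient sharing, but it requires slightly more advanced techniques. Reconstruction meanwhile works as before, except that we now interpolate at each of the secret points instead of only at point zero.</p>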
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">packed_reconstruct</span><span class="p">(</span><span class="n">shares</span><span class="p">):</span>
    <span class="n">points</span> <span class="o">=</span> <span class="n">SHARE_POINTS</span>
    <span class="n">values</span> <span class="o">=</span> <span class="n">shares</span>
    <span class="n">polynomial</span> <span class="o">=</span> <span class="p">[</span> <span class="p">(</span><span class="n">p</span><span class="p">,</span><span class="n">v</span><span class="p">)</span> <span class="k">for</span> <span class="n">p</span><span class="p">,</span><span class="n">v</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">points</span><span class="p">,</span> <span class="n">values</span><span class="p">)</span> <span class="k">if</span> <span class="n">v</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span> <span class="p">]</span>
    <span class="k">return</span> <span class="p">[</span> <span class="n">interpolate_at_point</span><span class="p">(</span><span class="n">polynomial</span><span class="p">,</span> <span class="n">p</span><span class="p">)</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">SECRET_POINTS</span> <span class="p">]</span>
</code></pre></div></div>
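<p>Putting the packed functions together, a round trip with some shares missing might look as follows. The concrete parameters (<code class="language-plaintext highlighter-rouge">Q = 433</code>, <code class="language-plaintext highlighter-rouge">K = 3</code>, <code class="language-plaintext highlighter-rouge">T = 2</code>, <code class="language-plaintext highlighter-rouge">N = 8</code>) and the way the point lists are built are assumptions made for this sketch, and interpolation is written compactly rather than with precomputed constants.</p>

```python
import random

Q = 433              # small prime modulus, illustration only (assumption)
K, T, N = 3, 2, 8    # secrets per sharing, privacy threshold, shareholders
R = T + K            # shares needed to reconstruct

SECRET_POINTS = [(-i) % Q for i in range(1, K + 1)]
RANDOMNESS_POINTS = [(-K - i) % Q for i in range(1, T + 1)]
SHARE_POINTS = list(range(1, N + 1))

def inverse(a):
    # modular inverse via Fermat's little theorem; valid because Q is prime
    return pow(a, Q - 2, Q)

def interpolate_at_point(points_values, point):
    total = 0
    for i, (xi, vi) in enumerate(points_values):
        num, denum = 1, 1
        for j, (xj, _) in enumerate(points_values):
            if j != i:
                num = (num * (xj - point)) % Q
                denum = (denum * (xj - xi)) % Q
        total = (total + vi * num * inverse(denum)) % Q
    return total

def sample_packed_polynomial(secrets):
    points = SECRET_POINTS + RANDOMNESS_POINTS
    values = secrets + [random.randrange(Q) for _ in range(T)]
    return list(zip(points, values))

def packed_share(secrets):
    polynomial = sample_packed_polynomial(secrets)
    return [interpolate_at_point(polynomial, p) for p in SHARE_POINTS]

def packed_reconstruct(shares):
    # missing shares are marked with None and simply skipped
    polynomial = [(p, v) for p, v in zip(SHARE_POINTS, shares) if v is not None]
    return [interpolate_at_point(polynomial, p) for p in SECRET_POINTS]

secrets = [11, 22, 33]
shares = packed_share(secrets)

# drop N - R = 3 shares; the remaining R = 5 still determine the polynomial
for lost in (0, 3, 6):
    shares[lost] = None

available = sum(v is not None for v in shares)
recovered = packed_reconstruct(shares)
```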
<p>Leaving computational efficiency aside, with this scheme we have reduced the number of shares each shareholder gets by a factor of <code class="language-plaintext highlighter-rouge">K</code>, which is useful in our running example of aggregating large vectors.</p>
<p>However, there’s another caveat: since the degree of the polynomial increased, from <code class="language-plaintext highlighter-rouge">T</code> to <code class="language-plaintext highlighter-rouge">T + K - 1</code>, we also have to adjust either the privacy threshold or the number of shares needed to reconstruct.</p>
<p>For example, say we use Shamir’s scheme to share a secret between <code class="language-plaintext highlighter-rouge">N = 10</code> shareholders and want a privacy guarantee against up to half of them collaborating, i.e. <code class="language-plaintext highlighter-rouge">T = 5</code>. Plugging this into our equation we get <code class="language-plaintext highlighter-rouge">5 + 1 = 6 <= 10</code> for Shamir’s scheme, meaning we can tolerate that up to <code class="language-plaintext highlighter-rouge">N - R = 4</code>, or 40%, of them go missing. However, if we use the packed scheme to share <code class="language-plaintext highlighter-rouge">K = 3</code> secrets together then we get <code class="language-plaintext highlighter-rouge">5 + 3 = 8 <= 10</code> and the tolerance drops to 20%.</p>
<p>One remedy is to simply multiply <code class="language-plaintext highlighter-rouge">N</code> and <code class="language-plaintext highlighter-rouge">T</code> by <code class="language-plaintext highlighter-rouge">K</code>; in the example we get <code class="language-plaintext highlighter-rouge">15 + 3 = 18 <= 30</code> and we are back to the original privacy threshold of half the shareholders and tolerance of 40%. The cost is that we now also need <code class="language-plaintext highlighter-rouge">K</code> times as many shareholders, so we have effectively kept the same number of shares but distributed them across a larger population.</p>
<p>(Note that a similar distribution may be achieved by partitioning the secrets and shareholders into <code class="language-plaintext highlighter-rouge">K</code> groups; this however has a negative effect on overall tolerance as we need <code class="language-plaintext highlighter-rouge">R</code> shares from each group.)</p>
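<p>The arithmetic in this example is easy to reproduce. The helper below is hypothetical (its name and return format are chosen here) and simply applies <code class="language-plaintext highlighter-rouge">R = T + K</code> together with the requirement that <code class="language-plaintext highlighter-rouge">R</code> cannot exceed <code class="language-plaintext highlighter-rouge">N</code>.</p>

```python
def reconstruction_tolerance(N, T, K=1):
    # R = T + K shares are needed; the rest may go missing
    R = T + K
    if R > N:
        raise ValueError("need T + K <= N")
    missing = N - R
    percent = 100 * missing // N
    return R, missing, percent

shamir = reconstruction_tolerance(N=10, T=5)        # Shamir's scheme, K = 1
packed = reconstruction_tolerance(N=10, T=5, K=3)   # packing three secrets
scaled = reconstruction_tolerance(N=30, T=15, K=3)  # N and T multiplied by K
```

<p>The three calls give tolerances of 40%, 20%, and 40% respectively, matching the numbers above.</p>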
<h2 id="homomorphic-addition-and-multiplication-1">Homomorphic addition and multiplication</h2>
<p>The scheme has the same homomorphic properties as Shamir’s, but now operates in a <a href="https://en.wikipedia.org/wiki/SIMD">SIMD</a> fashion where each addition or multiplication is simultaneously performed on every secret shared together. This in itself can have benefits if it fits naturally with the application.</p>
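<p>A small sketch of the SIMD behaviour, under assumed parameters <code class="language-plaintext highlighter-rouge">Q = 433</code>, <code class="language-plaintext highlighter-rouge">K = 2</code>, <code class="language-plaintext highlighter-rouge">T = 1</code>, and <code class="language-plaintext highlighter-rouge">N = 5</code>: the sharing polynomials have degree <code class="language-plaintext highlighter-rouge">T+K-1 = 2</code>, so a product has degree 4 and all five shares are needed to reconstruct it.</p>

```python
import random

Q = 433              # small prime modulus, illustration only (assumption)
K, T, N = 2, 1, 5    # N = 2 * (T + K - 1) + 1 so products can be reconstructed

SECRET_POINTS = [(-i) % Q for i in range(1, K + 1)]
RANDOMNESS_POINTS = [(-K - i) % Q for i in range(1, T + 1)]
SHARE_POINTS = list(range(1, N + 1))

def inverse(a):
    # modular inverse via Fermat's little theorem; valid because Q is prime
    return pow(a, Q - 2, Q)

def interpolate_at_point(points_values, point):
    total = 0
    for i, (xi, vi) in enumerate(points_values):
        num, denum = 1, 1
        for j, (xj, _) in enumerate(points_values):
            if j != i:
                num = (num * (xj - point)) % Q
                denum = (denum * (xj - xi)) % Q
        total = (total + vi * num * inverse(denum)) % Q
    return total

def packed_share(secrets):
    points = SECRET_POINTS + RANDOMNESS_POINTS
    values = secrets + [random.randrange(Q) for _ in range(T)]
    polynomial = list(zip(points, values))
    return [interpolate_at_point(polynomial, p) for p in SHARE_POINTS]

def packed_reconstruct(shares):
    polynomial = list(zip(SHARE_POINTS, shares))
    return [interpolate_at_point(polynomial, p) for p in SECRET_POINTS]

x = packed_share([2, 3])
y = packed_share([10, 20])

# one local operation per shareholder acts on both packed secrets at once
z_add = [(xi + yi) % Q for xi, yi in zip(x, y)]
z_mul = [(xi * yi) % Q for xi, yi in zip(x, y)]

sums = packed_reconstruct(z_add)      # component-wise sums
products = packed_reconstruct(z_mul)  # component-wise products
```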
<h1 id="next-steps">Next Steps</h1>
<p>Although an old and simple primitive, secret sharing has several properties that make it interesting as a way of delegating trust and computation to e.g. a community of users, even if the devices of these users are somewhat inefficient and unreliable.</p>
<p>In this post we have seen a few classical schemes as well as typical textbook algorithms to implement them. <a href="/2017/06/24/secret-sharing-part2">The next blog post</a> will improve on these algorithms and obtain significantly better performance.</p>
Morten Dahl
TL;DR: first part in a series where we look at secret sharing schemes, including the lesser known packed variant of Shamir’s scheme, and give full and efficient implementations; here we start with the textbook approaches, with follow-up posts focusing on improvements from more advanced techniques for sharing and reconstruction.