diff --git a/.gitignore b/.gitignore
index e69de29..937ba84 100644
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1 @@
+/llvm-3.9.1.src.tar.xz
diff --git a/0001-This-code-block-breaks-the-docs-build-http-lab.llvm..patch b/0001-This-code-block-breaks-the-docs-build-http-lab.llvm..patch
new file mode 100644
index 0000000..9d43070
--- /dev/null
+++ b/0001-This-code-block-breaks-the-docs-build-http-lab.llvm..patch
@@ -0,0 +1,30 @@
+From ce04fe5f8eb9f3a27504db75672083c8aaf80ddd Mon Sep 17 00:00:00 2001
+From: Aaron Ballman
+Date: Tue, 19 Jul 2016 17:46:55 +0000
+Subject: [PATCH] This code block breaks the docs build
+ (http://lab.llvm.org:8011/builders/llvm-sphinx-docs/builds/11920/steps/docs-llvm-html/logs/stdio),
+ but I cannot see anything immediately wrong with it and cannot reproduce the
+ diagnostic locally. Setting the code highlighting to none instead of nasm to
+ hopefully get the bot stumbling back towards green.
+
+git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275998 91177308-0d34-0410-b5e6-96231b3b80d8
+---
+ docs/AMDGPUUsage.rst | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/docs/AMDGPUUsage.rst b/docs/AMDGPUUsage.rst
+index 34a9b60..7d1ef11 100644
+--- a/docs/AMDGPUUsage.rst
++++ b/docs/AMDGPUUsage.rst
+@@ -171,7 +171,7 @@ keys, see the comments in lib/Target/AMDGPU/AmdKernelCodeT.h
+
+ Here is an example of a minimal amd_kernel_code_t specification:
+
+-.. code-block:: nasm
++.. code-block:: none
+
+ .hsa_code_object_version 1,0
+ .hsa_code_object_isa
+--
+2.5.5
+
diff --git a/0001-cmake-Install-CheckAtomic.cmake-needed-by-lldb.patch b/0001-cmake-Install-CheckAtomic.cmake-needed-by-lldb.patch
new file mode 100644
index 0000000..d317e39
--- /dev/null
+++ b/0001-cmake-Install-CheckAtomic.cmake-needed-by-lldb.patch
@@ -0,0 +1,39 @@
+From fdda55bb968b2c39da76baa85a29114f53154944 Mon Sep 17 00:00:00 2001
+From: Chris Bieneman
+Date: Thu, 25 Aug 2016 20:53:00 +0000
+Subject: [PATCH] cmake: Install CheckAtomic.cmake (needed by lldb)
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+Summary:
+Install CheckAtomic.cmake along with other LLVM modules, therefore making it possible for other projects to use it. This file is needed for LLDB to be built standalone, and installing it was suggested in https://reviews.llvm.org/D23881.
+
+Patch by: Michał Górny
+
+Reviewers: krytarowski, zturner, eugenis, jyknight, labath, beanz
+
+Subscribers: beanz, llvm-commits
+
+Differential Revision: https://reviews.llvm.org/D23887
+
+git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279777 91177308-0d34-0410-b5e6-96231b3b80d8
+---
+ cmake/modules/CMakeLists.txt | 3 +--
+ 1 file changed, 1 insertion(+), 2 deletions(-)
+
+diff --git a/cmake/modules/CMakeLists.txt b/cmake/modules/CMakeLists.txt
+index 826dd36..d2510b8 100644
+--- a/cmake/modules/CMakeLists.txt
++++ b/cmake/modules/CMakeLists.txt
+@@ -91,6 +91,5 @@ if (NOT LLVM_INSTALL_TOOLCHAIN_ONLY)
+ PATTERN LLVMConfig.cmake EXCLUDE
+ PATTERN LLVMConfigVersion.cmake EXCLUDE
+ PATTERN LLVM-Config.cmake EXCLUDE
+- PATTERN GetHostTriple.cmake EXCLUDE
+- PATTERN CheckAtomic.cmake EXCLUDE)
++ PATTERN GetHostTriple.cmake EXCLUDE)
+ endif()
+--
+2.5.5
+
diff --git a/0001-docs-fix-cmake-code-block-warning.patch b/0001-docs-fix-cmake-code-block-warning.patch
new file mode 100644
index 0000000..da63112
--- /dev/null
+++ b/0001-docs-fix-cmake-code-block-warning.patch
@@ -0,0 +1,44 @@
+From f12c36b2bc2e1db86098c181b88b8003c595e63c Mon Sep 17 00:00:00 2001
+From: Renato Golin
+Date: Wed, 20 Jul 2016 09:47:09 +0000
+Subject: [PATCH] [docs] fix cmake code-block warning
+
+This will unblock the llvm-sphinx-buildbot, which is currently failing due
+to a warning being treated as error.
+
+git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276100 91177308-0d34-0410-b5e6-96231b3b80d8
+---
+ docs/CMakePrimer.rst | 8 ++++----
+ 1 file changed, 4 insertions(+), 4 deletions(-)
+
+diff --git a/docs/CMakePrimer.rst b/docs/CMakePrimer.rst
+index 0347790..1e3a09e 100644
+--- a/docs/CMakePrimer.rst
++++ b/docs/CMakePrimer.rst
+@@ -246,11 +246,11 @@ In general CMake if blocks work the way you'd expect:
+ .. code-block:: cmake
+
+ if()
+- .. do stuff
++ message("do stuff")
+ elseif()
+- .. do other stuff
++ message("do other stuff")
+ else()
+- .. do other other stuff
++ message("do other other stuff")
+ endif()
+
+ The single most important thing to know about CMake's if blocks coming from a C
+@@ -265,7 +265,7 @@ The most common form of the CMake ``foreach`` block is:
+ .. code-block:: cmake
+
+ foreach(var ...)
+- .. do stuff
++ message("do stuff")
+ endforeach()
+
+ The variable argument portion of the ``foreach`` block can contain dereferenced
+--
+2.5.5
+
diff --git a/0001-fix-docs-2.patch b/0001-fix-docs-2.patch
new file mode 100644
index 0000000..988cb64
--- /dev/null
+++ b/0001-fix-docs-2.patch
@@ -0,0 +1,38 @@
+From de4fbfe93560c78f29c8b92cafab0793f5d26bc6 Mon Sep 17 00:00:00 2001
+From: Aaron Ballman
+Date: Tue, 19 Jul 2016 20:20:03 +0000
+Subject: [PATCH] This code block breaks the docs build
+ (http://lab.llvm.org:8011/builders/llvm-sphinx-docs/builds/11921/steps/docs-llvm-html/logs/stdio).
+ Setting the code highlighting to none instead of llvm to hopefully get the
+ bot stumbling back towards green.
+
+git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276018 91177308-0d34-0410-b5e6-96231b3b80d8
+---
+ docs/BitCodeFormat.rst | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+diff --git a/docs/BitCodeFormat.rst b/docs/BitCodeFormat.rst
+index ffa2176..89c7c1b 100644
+--- a/docs/BitCodeFormat.rst
++++ b/docs/BitCodeFormat.rst
+@@ -596,7 +596,7 @@ will be encoded as 1.
+
+ For example, instead of
+
+-.. code-block:: llvm
++.. code-block:: none
+
+ #n = load #n-1
+ #n+1 = icmp eq #n, #const0
+@@ -604,7 +604,7 @@ For example, instead of
+
+ version 1 will encode the instructions as
+
+-.. code-block:: llvm
++.. code-block:: none
+
+ #n = load #1
+ #n+1 = icmp eq #1, (#n+1)-#const0
+--
+2.5.5
+
diff --git a/0001-fix-docs-3.patch b/0001-fix-docs-3.patch
new file mode 100644
index 0000000..f46b57f
--- /dev/null
+++ b/0001-fix-docs-3.patch
@@ -0,0 +1,46 @@
+From 9871423412faa2ed8380445a26ed1b0991a18502 Mon Sep 17 00:00:00 2001
+From: Aaron Ballman
+Date: Tue, 19 Jul 2016 23:50:11 +0000
+Subject: [PATCH] This code block breaks the docs build
+ (http://lab.llvm.org:8011/builders/llvm-sphinx-docs/builds/11925/steps/docs-llvm-html/logs/stdio).
+ Setting the code highlighting to none instead of llvm.
+
+git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276060 91177308-0d34-0410-b5e6-96231b3b80d8
+---
+ docs/BranchWeightMetadata.rst | 6 +++---
+ 1 file changed, 3 insertions(+), 3 deletions(-)
+
+diff --git a/docs/BranchWeightMetadata.rst b/docs/BranchWeightMetadata.rst
+index 6cbcb0f..9e61d23 100644
+--- a/docs/BranchWeightMetadata.rst
++++ b/docs/BranchWeightMetadata.rst
+@@ -29,7 +29,7 @@ Supported Instructions
+ Metadata is only assigned to the conditional branches. There are two extra
+ operands for the true and the false branch.
+
+-.. code-block:: llvm
++.. code-block:: none
+
+ !0 = metadata !{
+ metadata !"branch_weights",
+@@ -43,7 +43,7 @@ operands for the true and the false branch.
+ Branch weights are assigned to every case (including the ``default`` case which
+ is always case #0).
+
+-.. code-block:: llvm
++.. code-block:: none
+
+ !0 = metadata !{
+ metadata !"branch_weights",
+@@ -56,7 +56,7 @@ is always case #0).
+
+ Branch weights are assigned to every destination.
+
+-.. code-block:: llvm
++.. code-block:: none
+
+ !0 = metadata !{
+ metadata !"branch_weights",
+--
+2.5.5
+
diff --git a/install_dirs.patch b/install_dirs.patch
new file mode 100644
index 0000000..3ebe853
--- /dev/null
+++ b/install_dirs.patch
@@ -0,0 +1,392 @@
+diff -up llvm-3.9.1.src/bindings/ocaml/backends/CMakeLists.txt.instdirs llvm-3.9.1.src/bindings/ocaml/backends/CMakeLists.txt
+--- llvm-3.9.1.src/bindings/ocaml/backends/CMakeLists.txt.instdirs 2014-12-29 20:24:07.000000000 -0700
++++ llvm-3.9.1.src/bindings/ocaml/backends/CMakeLists.txt 2017-02-13 13:36:43.999154756 -0700
+@@ -23,5 +23,5 @@ foreach(TARGET ${LLVM_TARGETS_TO_BUILD})
+ "${LLVM_LIBRARY_DIR}/ocaml/META.llvm_${TARGET}")
+
+ install(FILES "${LLVM_LIBRARY_DIR}/ocaml/META.llvm_${TARGET}"
+- DESTINATION lib/ocaml)
++ DESTINATION ${CMAKE_INSTALL_LIBDIR}/ocaml)
+ endforeach()
+diff -up llvm-3.9.1.src/bindings/ocaml/llvm/CMakeLists.txt.instdirs llvm-3.9.1.src/bindings/ocaml/llvm/CMakeLists.txt
+--- llvm-3.9.1.src/bindings/ocaml/llvm/CMakeLists.txt.instdirs 2014-12-29 20:24:07.000000000 -0700
++++ llvm-3.9.1.src/bindings/ocaml/llvm/CMakeLists.txt 2017-02-13 13:36:43.999154756 -0700
+@@ -8,4 +8,4 @@ configure_file(
+ "${LLVM_LIBRARY_DIR}/ocaml/META.llvm")
+
+ install(FILES "${LLVM_LIBRARY_DIR}/ocaml/META.llvm"
+- DESTINATION lib/ocaml)
++ DESTINATION ${CMAKE_INSTALL_LIBDIR}/ocaml)
+diff -up llvm-3.9.1.src/CMakeLists.txt.instdirs llvm-3.9.1.src/CMakeLists.txt
+--- llvm-3.9.1.src/CMakeLists.txt.instdirs 2016-09-13 07:44:50.000000000 -0600
++++ llvm-3.9.1.src/CMakeLists.txt 2017-02-13 13:36:44.003154733 -0700
+@@ -194,13 +194,15 @@ if (CMAKE_BUILD_TYPE AND
+ endif()
+
+ set(LLVM_LIBDIR_SUFFIX "" CACHE STRING "Define suffix of library directory name (32/64)" )
+-
+-set(LLVM_TOOLS_INSTALL_DIR "bin" CACHE STRING "Path for binary subdirectory (defaults to 'bin')")
+-mark_as_advanced(LLVM_TOOLS_INSTALL_DIR)
++set(CMAKE_INSTALL_BINDIR bin CACHE STRING "Path for binary subdirectory relative to prefix (defaults to 'bin')" )
++set(CMAKE_INSTALL_LIBDIR lib${LLVM_LIBDIR_SUFFIX} CACHE STRING "Path for library subdirectory (defaults to 'lib${LLVM_LIBDIR_SUFFIX}'" )
++set(CMAKE_INSTALL_INCLUDEDIR include CACHE STRING "Path for include subdirectory relative to prefix (defaults to 'include'" )
++set(CMAKE_INSTALL_DOCDIR share/doc/${project} CACHE STRING "Path for documentation subdirectory relative to prefix (defaults to 'share/doc/${project}')" )
++set(CMAKE_INSTALL_MANDIR share/man CACHE STRING "Path for manpages subdirectory relative to prefix (defaults to 'share/man')" )
+
+ # They are used as destination of target generators.
+-set(LLVM_RUNTIME_OUTPUT_INTDIR ${CMAKE_CURRENT_BINARY_DIR}/${CMAKE_CFG_INTDIR}/bin)
+-set(LLVM_LIBRARY_OUTPUT_INTDIR ${CMAKE_CURRENT_BINARY_DIR}/${CMAKE_CFG_INTDIR}/lib${LLVM_LIBDIR_SUFFIX})
++set(LLVM_RUNTIME_OUTPUT_INTDIR ${CMAKE_CURRENT_BINARY_DIR}/${CMAKE_CFG_INTDIR}/${CMAKE_INSTALL_BINDIR})
++set(LLVM_LIBRARY_OUTPUT_INTDIR ${CMAKE_CURRENT_BINARY_DIR}/${CMAKE_CFG_INTDIR}/${CMAKE_INSTALL_LIBDIR})
+ if(WIN32 OR CYGWIN)
+ # DLL platform -- put DLLs into bin.
+ set(LLVM_SHLIB_OUTPUT_INTDIR ${LLVM_RUNTIME_OUTPUT_INTDIR})
+@@ -613,8 +615,8 @@ configure_file(
+
+ # They are not referenced. See set_output_directory().
+ set( CMAKE_RUNTIME_OUTPUT_DIRECTORY ${LLVM_BINARY_DIR}/bin )
+-set( CMAKE_LIBRARY_OUTPUT_DIRECTORY ${LLVM_BINARY_DIR}/lib${LLVM_LIBDIR_SUFFIX} )
+-set( CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${LLVM_BINARY_DIR}/lib${LLVM_LIBDIR_SUFFIX} )
++set( CMAKE_LIBRARY_OUTPUT_DIRECTORY ${LLVM_BINARY_DIR}/${CMAKE_INSTALL_LIBDIR} )
++set( CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${LLVM_BINARY_DIR}/${CMAKE_INSTALL_LIBDIR} )
+
+ set(CMAKE_BUILD_WITH_INSTALL_RPATH ON)
+ if (APPLE)
+@@ -622,7 +624,7 @@ if (APPLE)
+ set(CMAKE_INSTALL_RPATH "@executable_path/../lib")
+ else(UNIX)
+ if(NOT DEFINED CMAKE_INSTALL_RPATH)
+- set(CMAKE_INSTALL_RPATH "\$ORIGIN/../lib${LLVM_LIBDIR_SUFFIX}")
++ set(CMAKE_INSTALL_RPATH "\$ORIGIN/../${CMAKE_INSTALL_LIBDIR}")
+ if(${CMAKE_SYSTEM_NAME} MATCHES "(FreeBSD|DragonFly)")
+ set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,-z,origin")
+ set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -Wl,-z,origin")
+@@ -799,7 +801,7 @@ add_subdirectory(cmake/modules)
+
+ if (NOT LLVM_INSTALL_TOOLCHAIN_ONLY)
+ install(DIRECTORY include/llvm include/llvm-c
+- DESTINATION include
++ DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}
+ COMPONENT llvm-headers
+ FILES_MATCHING
+ PATTERN "*.def"
+@@ -811,7 +813,7 @@ if (NOT LLVM_INSTALL_TOOLCHAIN_ONLY)
+ )
+
+ install(DIRECTORY ${LLVM_INCLUDE_DIR}/llvm
+- DESTINATION include
++ DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}
+ COMPONENT llvm-headers
+ FILES_MATCHING
+ PATTERN "*.def"
+diff -up llvm-3.9.1.src/cmake/modules/AddLLVM.cmake.instdirs llvm-3.9.1.src/cmake/modules/AddLLVM.cmake
+--- llvm-3.9.1.src/cmake/modules/AddLLVM.cmake.instdirs 2016-07-09 20:43:47.000000000 -0600
++++ llvm-3.9.1.src/cmake/modules/AddLLVM.cmake 2017-02-13 13:36:44.012154680 -0700
+@@ -546,7 +558,7 @@ macro(add_llvm_library name)
+ set_target_properties( ${name} PROPERTIES EXCLUDE_FROM_ALL ON)
+ elseif(NOT _is_gtest)
+ if (NOT LLVM_INSTALL_TOOLCHAIN_ONLY OR ${name} STREQUAL "LTO")
+- set(install_dir lib${LLVM_LIBDIR_SUFFIX})
++ set(install_dir ${CMAKE_INSTALL_LIBDIR})
+ if(ARG_SHARED OR BUILD_SHARED_LIBS)
+ if(WIN32 OR CYGWIN OR MINGW)
+ set(install_type RUNTIME)
+@@ -590,12 +602,12 @@ macro(add_llvm_loadable_module name)
+ # DLL platform
+ set(dlldir "bin")
+ else()
+- set(dlldir "lib${LLVM_LIBDIR_SUFFIX}")
++ set(dlldir "${CMAKE_INSTALL_LIBDIR}")
+ endif()
+ install(TARGETS ${name}
+ EXPORT LLVMExports
+ LIBRARY DESTINATION ${dlldir}
+- ARCHIVE DESTINATION lib${LLVM_LIBDIR_SUFFIX})
++ ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR})
+ endif()
+ set_property(GLOBAL APPEND PROPERTY LLVM_EXPORTS ${name})
+ endif()
+@@ -770,7 +782,7 @@ macro(add_llvm_tool name)
+ if( LLVM_BUILD_TOOLS )
+ install(TARGETS ${name}
+ EXPORT LLVMExports
+- RUNTIME DESTINATION ${LLVM_TOOLS_INSTALL_DIR}
++ RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR}
+ COMPONENT ${name})
+
+ if (NOT CMAKE_CONFIGURATION_TYPES)
+@@ -795,7 +807,7 @@ macro(add_llvm_example name)
+ endif()
+ add_llvm_executable(${name} ${ARGN})
+ if( LLVM_BUILD_EXAMPLES )
+- install(TARGETS ${name} RUNTIME DESTINATION examples)
++ install(TARGETS ${name} RUNTIME DESTINATION ${CMAKE_INSTALL_DOCDIR}/examples)
+ endif()
+ set_target_properties(${name} PROPERTIES FOLDER "Examples")
+ endmacro(add_llvm_example name)
+@@ -811,7 +823,7 @@ macro(add_llvm_utility name)
+ set_target_properties(${name} PROPERTIES FOLDER "Utils")
+ if( LLVM_INSTALL_UTILS AND LLVM_BUILD_UTILS )
+ install (TARGETS ${name}
+- RUNTIME DESTINATION bin
++ RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR}
+ COMPONENT ${name})
+ if (NOT CMAKE_CONFIGURATION_TYPES)
+ add_custom_target(install-${name}
+@@ -1173,7 +1185,7 @@ function(llvm_install_library_symlink na
+ set(full_name ${CMAKE_${type}_LIBRARY_PREFIX}${name}${CMAKE_${type}_LIBRARY_SUFFIX})
+ set(full_dest ${CMAKE_${type}_LIBRARY_PREFIX}${dest}${CMAKE_${type}_LIBRARY_SUFFIX})
+
+- set(output_dir lib${LLVM_LIBDIR_SUFFIX})
++ set(output_dir ${CMAKE_INSTALL_LIBDIR})
+ if(WIN32 AND "${type}" STREQUAL "SHARED")
+ set(output_dir bin)
+ endif()
+@@ -1210,7 +1222,7 @@ function(llvm_install_symlink name dest)
+ set(full_dest ${dest}${CMAKE_EXECUTABLE_SUFFIX})
+
+ install(SCRIPT ${INSTALL_SYMLINK}
+- CODE "install_symlink(${full_name} ${full_dest} ${LLVM_TOOLS_INSTALL_DIR})"
++ CODE "install_symlink(${full_name} ${full_dest} ${CMAKE_INSTALL_BINDIR})"
+ COMPONENT ${component})
+
+ if (NOT CMAKE_CONFIGURATION_TYPES AND NOT ARG_ALWAYS_GENERATE)
+diff -up llvm-3.9.1.src/cmake/modules/AddOCaml.cmake.instdirs llvm-3.9.1.src/cmake/modules/AddOCaml.cmake
+--- llvm-3.9.1.src/cmake/modules/AddOCaml.cmake.instdirs 2016-06-21 17:10:37.000000000 -0600
++++ llvm-3.9.1.src/cmake/modules/AddOCaml.cmake 2017-02-13 13:36:44.001154744 -0700
+@@ -189,12 +189,12 @@ function(add_ocaml_library name)
+ endforeach()
+
+ install(FILES ${install_files}
+- DESTINATION lib/ocaml)
++ DESTINATION ${CMAKE_INSTALL_LIBDIR}/ocaml)
+ install(FILES ${install_shlibs}
+ PERMISSIONS OWNER_READ OWNER_WRITE OWNER_EXECUTE
+ GROUP_READ GROUP_EXECUTE
+ WORLD_READ WORLD_EXECUTE
+- DESTINATION lib/ocaml)
++ DESTINATION ${CMAKE_INSTALL_LIBDIR}/ocaml)
+
+ foreach( install_file ${install_files} ${install_shlibs} )
+ get_filename_component(filename "${install_file}" NAME)
+diff -up llvm-3.9.1.src/cmake/modules/AddSphinxTarget.cmake.instdirs llvm-3.9.1.src/cmake/modules/AddSphinxTarget.cmake
+--- llvm-3.9.1.src/cmake/modules/AddSphinxTarget.cmake.instdirs 2014-08-14 05:57:16.000000000 -0600
++++ llvm-3.9.1.src/cmake/modules/AddSphinxTarget.cmake 2017-02-13 13:36:44.001154744 -0700
+@@ -50,11 +50,11 @@ function (add_sphinx_target builder proj
+ if (builder STREQUAL man)
+ # FIXME: We might not ship all the tools that these man pages describe
+ install(DIRECTORY "${SPHINX_BUILD_DIR}/" # Slash indicates contents of
+- DESTINATION share/man/man1)
++ DESTINATION ${CMAKE_INSTALL_MANDIR}/man1)
+
+ elseif (builder STREQUAL html)
+ install(DIRECTORY "${SPHINX_BUILD_DIR}"
+- DESTINATION "share/doc/${project}")
++ DESTINATION ${CMAKE_INSTALL_DOCDIR})
+ else()
+ message(WARNING Installation of ${builder} not supported)
+ endif()
+diff -up llvm-3.9.1.src/cmake/modules/CMakeLists.txt.instdirs llvm-3.9.1.src/cmake/modules/CMakeLists.txt
+--- llvm-3.9.1.src/cmake/modules/CMakeLists.txt.instdirs 2017-02-13 13:36:43.995154779 -0700
++++ llvm-3.9.1.src/cmake/modules/CMakeLists.txt 2017-02-13 13:40:40.508732673 -0700
+@@ -1,4 +1,4 @@
+-set(LLVM_INSTALL_PACKAGE_DIR lib${LLVM_LIBDIR_SUFFIX}/cmake/llvm)
++set(LLVM_INSTALL_PACKAGE_DIR ${CMAKE_INSTALL_LIBDIR}/cmake/llvm CACHE STRING "Path for CMake subdirectory (defaults to 'cmake/llvm')")
+ set(llvm_cmake_builddir "${LLVM_BINARY_DIR}/${LLVM_INSTALL_PACKAGE_DIR}")
+
+ get_property(LLVM_EXPORTS GLOBAL PROPERTY LLVM_EXPORTS)
+@@ -49,20 +49,12 @@ file(COPY .
+
+ # Generate LLVMConfig.cmake for the install tree.
+ set(LLVM_CONFIG_CODE "
+-# Compute the installation prefix from this LLVMConfig.cmake file location.
+-get_filename_component(LLVM_INSTALL_PREFIX \"\${CMAKE_CURRENT_LIST_FILE}\" PATH)")
+-# Construct the proper number of get_filename_component(... PATH)
+-# calls to compute the installation prefix.
+-string(REGEX REPLACE "/" ";" _count "${LLVM_INSTALL_PACKAGE_DIR}")
+-foreach(p ${_count})
+- set(LLVM_CONFIG_CODE "${LLVM_CONFIG_CODE}
+-get_filename_component(LLVM_INSTALL_PREFIX \"\${LLVM_INSTALL_PREFIX}\" PATH)")
+-endforeach(p)
+-set(LLVM_CONFIG_INCLUDE_DIRS "\${LLVM_INSTALL_PREFIX}/include")
+-set(LLVM_CONFIG_LIBRARY_DIRS "\${LLVM_INSTALL_PREFIX}/lib\${LLVM_LIBDIR_SUFFIX}")
++set(LLVM_INSTALL_PREFIX \"${CMAKE_INSTALL_PREFIX}\")")
++set(LLVM_CONFIG_INCLUDE_DIRS "\${LLVM_INSTALL_PREFIX}/${CMAKE_INSTALL_INCLUDEDIR}")
++set(LLVM_CONFIG_LIBRARY_DIRS "\${LLVM_INSTALL_PREFIX}/${CMAKE_INSTALL_LIBDIR}")
+ set(LLVM_CONFIG_CMAKE_DIR "\${LLVM_INSTALL_PREFIX}/${LLVM_INSTALL_PACKAGE_DIR}")
+ set(LLVM_CONFIG_BINARY_DIR "\${LLVM_INSTALL_PREFIX}")
+-set(LLVM_CONFIG_TOOLS_BINARY_DIR "\${LLVM_INSTALL_PREFIX}/bin")
++set(LLVM_CONFIG_TOOLS_BINARY_DIR "\${LLVM_INSTALL_PREFIX}/${CMAKE_INSTALL_BINDIR}")
+ set(LLVM_CONFIG_EXPORTS_FILE "\${LLVM_CMAKE_DIR}/LLVMExports.cmake")
+ configure_file(
+ LLVMConfig.cmake.in
+diff -up llvm-3.9.1.src/cmake/modules/LLVMConfig.cmake.in.instdirs llvm-3.9.1.src/cmake/modules/LLVMConfig.cmake.in
+--- llvm-3.9.1.src/cmake/modules/LLVMConfig.cmake.in.instdirs 2016-07-01 08:22:52.000000000 -0600
++++ llvm-3.9.1.src/cmake/modules/LLVMConfig.cmake.in 2017-02-13 13:36:44.002154738 -0700
+@@ -59,7 +59,7 @@ set(LLVM_DEFINITIONS "@LLVM_DEFINITIONS@
+ set(LLVM_CMAKE_DIR "@LLVM_CONFIG_CMAKE_DIR@")
+ set(LLVM_BINARY_DIR "@LLVM_CONFIG_BINARY_DIR@")
+ set(LLVM_TOOLS_BINARY_DIR "@LLVM_CONFIG_TOOLS_BINARY_DIR@")
+-set(LLVM_TOOLS_INSTALL_DIR "@LLVM_TOOLS_INSTALL_DIR@")
++set(LLVM_TOOLS_INSTALL_DIR "@CMAKE_INSTALL_BINDIR@")
+
+ if(NOT TARGET LLVMSupport)
+ set(LLVM_EXPORTED_TARGETS "@LLVM_EXPORTS@")
+diff -up llvm-3.9.1.src/cmake/modules/TableGen.cmake.instdirs llvm-3.9.1.src/cmake/modules/TableGen.cmake
+--- llvm-3.9.1.src/cmake/modules/TableGen.cmake.instdirs 2016-06-08 15:19:26.000000000 -0600
++++ llvm-3.9.1.src/cmake/modules/TableGen.cmake 2017-02-13 13:47:59.832154520 -0700
+@@ -6,7 +6,6 @@ function(tablegen project ofn)
+ # Validate calling context.
+ foreach(v
+ ${project}_TABLEGEN_EXE
+- LLVM_MAIN_SRC_DIR
+ LLVM_MAIN_INCLUDE_DIR
+ )
+ if(NOT ${v})
+@@ -23,10 +22,14 @@ function(tablegen project ofn)
+ set(LLVM_TARGET_DEFINITIONS_ABSOLUTE
+ ${CMAKE_CURRENT_SOURCE_DIR}/${LLVM_TARGET_DEFINITIONS})
+ endif()
++ if (LLVM_MAIN_SRC_DIR)
++ set(TABLEGEN_INCLUDES -I ${LLVM_MAIN_SRC_DIR}/lib/Target)
++ endif()
++ set(TABLEGEN_INCLUDES ${TABLEGEN_INCLUDES} -I ${LLVM_MAIN_INCLUDE_DIR})
+ add_custom_command(OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/${ofn}.tmp
+ # Generate tablegen output in a temporary file.
+ COMMAND ${${project}_TABLEGEN_EXE} ${ARGN} -I ${CMAKE_CURRENT_SOURCE_DIR}
+- -I ${LLVM_MAIN_SRC_DIR}/lib/Target -I ${LLVM_MAIN_INCLUDE_DIR}
++ ${TABLEGEN_INCLUDES}
+ ${LLVM_TARGET_DEFINITIONS_ABSOLUTE}
+ -o ${CMAKE_CURRENT_BINARY_DIR}/${ofn}.tmp
+ # The file in LLVM_TARGET_DEFINITIONS may be not in the current
+@@ -141,7 +144,7 @@ macro(add_tablegen target project)
+ if (${project} STREQUAL LLVM AND NOT LLVM_INSTALL_TOOLCHAIN_ONLY)
+ install(TARGETS ${target}
+ EXPORT LLVMExports
+- RUNTIME DESTINATION ${LLVM_TOOLS_INSTALL_DIR})
++ RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR})
+ endif()
+ set_property(GLOBAL APPEND PROPERTY LLVM_EXPORTS ${target})
+ endmacro()
+diff -up llvm-3.9.1.src/docs/CMakeLists.txt.instdirs llvm-3.9.1.src/docs/CMakeLists.txt
+--- llvm-3.9.1.src/docs/CMakeLists.txt.instdirs 2015-08-17 17:24:17.000000000 -0600
++++ llvm-3.9.1.src/docs/CMakeLists.txt 2017-02-13 13:36:44.004154727 -0700
+@@ -94,7 +94,7 @@ if (LLVM_ENABLE_DOXYGEN)
+
+ if (NOT LLVM_INSTALL_TOOLCHAIN_ONLY)
+ install(DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/doxygen/html
+- DESTINATION docs/html)
++ DESTINATION ${CMAKE_INSTALL_DOCDIR}/html)
+ endif()
+ endif()
+ endif()
+@@ -155,6 +155,6 @@ if( NOT uses_ocaml LESS 0 )
+
+ if (NOT LLVM_INSTALL_TOOLCHAIN_ONLY)
+ install(DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/ocamldoc/html
+- DESTINATION docs/ocaml/html)
++ DESTINATION ${CMAKE_INSTALL_DOCDIR}/ocaml/html)
+ endif()
+ endif()
+diff -up llvm-3.9.1.src/include/llvm/CMakeLists.txt.instdirs llvm-3.9.1.src/include/llvm/CMakeLists.txt
+--- llvm-3.9.1.src/include/llvm/CMakeLists.txt.instdirs 2014-08-13 18:51:47.000000000 -0600
++++ llvm-3.9.1.src/include/llvm/CMakeLists.txt 2017-02-13 13:36:44.004154727 -0700
+@@ -3,5 +3,5 @@ add_subdirectory(IR)
+ # If we're doing an out-of-tree build, copy a module map for generated
+ # header files into the build area.
+ if (NOT "${CMAKE_SOURCE_DIR}" STREQUAL "${CMAKE_BINARY_DIR}")
+- configure_file(module.modulemap.build module.modulemap COPYONLY)
++ configure_file(module.modulemap.build ${LLVM_INCLUDE_DIR}/module.modulemap COPYONLY)
+ endif (NOT "${CMAKE_SOURCE_DIR}" STREQUAL "${CMAKE_BINARY_DIR}")
+diff -up llvm-3.9.1.src/tools/llvm-config/BuildVariables.inc.in.instdirs llvm-3.9.1.src/tools/llvm-config/BuildVariables.inc.in
+--- llvm-3.9.1.src/tools/llvm-config/BuildVariables.inc.in.instdirs 2016-03-07 17:02:50.000000000 -0700
++++ llvm-3.9.1.src/tools/llvm-config/BuildVariables.inc.in 2017-02-13 13:36:44.005154721 -0700
+@@ -23,7 +23,9 @@
+ #define LLVM_LDFLAGS "@LLVM_LDFLAGS@"
+ #define LLVM_CXXFLAGS "@LLVM_CXXFLAGS@"
+ #define LLVM_BUILDMODE "@LLVM_BUILDMODE@"
+-#define LLVM_LIBDIR_SUFFIX "@LLVM_LIBDIR_SUFFIX@"
++#define LLVM_BINARY_DIR "@CMAKE_INSTALL_BINDIR@"
++#define LLVM_LIBRARY_DIR "@CMAKE_INSTALL_LIBDIR@"
++#define LLVM_INCLUDE_DIR "@CMAKE_INSTALL_INCLUDEDIR@"
+ #define LLVM_TARGETS_BUILT "@LLVM_TARGETS_BUILT@"
+ #define LLVM_SYSTEM_LIBS "@LLVM_SYSTEM_LIBS@"
+ #define LLVM_BUILD_SYSTEM "@LLVM_BUILD_SYSTEM@"
+diff -up llvm-3.9.1.src/tools/llvm-config/llvm-config.cpp.instdirs llvm-3.9.1.src/tools/llvm-config/llvm-config.cpp
+--- llvm-3.9.1.src/tools/llvm-config/llvm-config.cpp.instdirs 2016-03-14 15:39:58.000000000 -0600
++++ llvm-3.9.1.src/tools/llvm-config/llvm-config.cpp 2017-02-13 13:36:44.006154715 -0700
+@@ -290,7 +290,7 @@ int main(int argc, char **argv) {
+ DevelopmentTreeLayout = CMakeStyle;
+ ActiveObjRoot = LLVM_OBJ_ROOT;
+ } else if (sys::fs::equivalent(CurrentExecPrefix,
+- Twine(LLVM_OBJ_ROOT) + "/bin")) {
++ Twine(LLVM_OBJ_ROOT) + "/" + LLVM_BINARY_DIR)) {
+ IsInDevelopmentTree = true;
+ DevelopmentTreeLayout = CMakeBuildModeStyle;
+ ActiveObjRoot = LLVM_OBJ_ROOT;
+@@ -304,32 +304,32 @@ int main(int argc, char **argv) {
+ std::string ActivePrefix, ActiveBinDir, ActiveIncludeDir, ActiveLibDir;
+ std::string ActiveIncludeOption;
+ if (IsInDevelopmentTree) {
+- ActiveIncludeDir = std::string(LLVM_SRC_ROOT) + "/include";
++ ActiveIncludeDir = std::string(LLVM_SRC_ROOT) + "/" + LLVM_INCLUDE_DIR;
+ ActivePrefix = CurrentExecPrefix;
+
+ // CMake organizes the products differently than a normal prefix style
+ // layout.
+ switch (DevelopmentTreeLayout) {
+ case CMakeStyle:
+- ActiveBinDir = ActiveObjRoot + "/bin";
+- ActiveLibDir = ActiveObjRoot + "/lib" + LLVM_LIBDIR_SUFFIX;
++ ActiveBinDir = ActiveObjRoot + "/" + LLVM_BINARY_DIR;
++ ActiveLibDir = ActiveObjRoot + LLVM_LIBRARY_DIR;
+ break;
+ case CMakeBuildModeStyle:
+ ActivePrefix = ActiveObjRoot;
+- ActiveBinDir = ActiveObjRoot + "/bin/" + build_mode;
++ ActiveBinDir = ActiveObjRoot + "/" + LLVM_BINARY_DIR + "/" + build_mode;
+ ActiveLibDir =
+- ActiveObjRoot + "/lib" + LLVM_LIBDIR_SUFFIX + "/" + build_mode;
++ ActiveObjRoot + "/" + LLVM_LIBRARY_DIR + "/" + build_mode;
+ break;
+ }
+
+ // We need to include files from both the source and object trees.
+ ActiveIncludeOption =
+- ("-I" + ActiveIncludeDir + " " + "-I" + ActiveObjRoot + "/include");
++ ("-I" + ActiveIncludeDir + " " + "-I" + ActiveObjRoot + "/" + LLVM_INCLUDE_DIR);
+ } else {
+ ActivePrefix = CurrentExecPrefix;
+- ActiveIncludeDir = ActivePrefix + "/include";
+- ActiveBinDir = ActivePrefix + "/bin";
+- ActiveLibDir = ActivePrefix + "/lib" + LLVM_LIBDIR_SUFFIX;
++ ActiveIncludeDir = ActivePrefix + "/" + LLVM_INCLUDE_DIR;
++ ActiveBinDir = ActivePrefix + "/" + LLVM_BINARY_DIR;
++ ActiveLibDir = ActivePrefix + "/" + LLVM_LIBRARY_DIR;
+ ActiveIncludeOption = "-I" + ActiveIncludeDir;
+ }
+
+diff -up llvm-3.9.1.src/tools/llvm-shlib/CMakeLists.txt.instdirs llvm-3.9.1.src/tools/llvm-shlib/CMakeLists.txt
+--- llvm-3.9.1.src/tools/llvm-shlib/CMakeLists.txt.instdirs 2016-05-25 22:35:35.000000000 -0600
++++ llvm-3.9.1.src/tools/llvm-shlib/CMakeLists.txt 2017-02-13 13:36:44.065154372 -0700
+@@ -68,7 +66,7 @@ if(LLVM_BUILD_LLVM_C_DYLIB)
+
+ set(LLVM_EXPORTED_SYMBOL_FILE ${CMAKE_BINARY_DIR}/libllvm-c.exports)
+
+- set(LIB_DIR ${CMAKE_BINARY_DIR}/${CMAKE_CFG_INTDIR}/lib${LLVM_LIBDIR_SUFFIX})
++ set(LIB_DIR ${CMAKE_BINARY_DIR}/${CMAKE_CFG_INTDIR}/${CMAKE_INSTALL_LIBDIR})
+ set(LIB_NAME ${LIB_DIR}/${CMAKE_SHARED_LIBRARY_PREFIX}LLVM)
+ set(LIB_PATH ${LIB_NAME}${CMAKE_SHARED_LIBRARY_SUFFIX})
+ set(LIB_EXPORTS_PATH ${LIB_NAME}.exports)
+diff -up llvm-3.9.1.src/tools/lto/CMakeLists.txt.instdirs llvm-3.9.1.src/tools/lto/CMakeLists.txt
+--- llvm-3.9.1.src/tools/lto/CMakeLists.txt.instdirs 2016-07-11 21:01:22.000000000 -0600
++++ llvm-3.9.1.src/tools/lto/CMakeLists.txt 2017-02-13 13:36:44.007154709 -0700
+@@ -19,7 +19,7 @@ set(LLVM_EXPORTED_SYMBOL_FILE ${CMAKE_CU
+ add_llvm_library(LTO SHARED ${SOURCES})
+
+ install(FILES ${LLVM_MAIN_INCLUDE_DIR}/llvm-c/lto.h
+- DESTINATION include/llvm-c
++ DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}/llvm-c
+ COMPONENT LTO)
+
+ if (APPLE)
diff --git a/llvm-3.7.1-cmake-s390.patch b/llvm-3.7.1-cmake-s390.patch
new file mode 100644
index 0000000..bc9b583
--- /dev/null
+++ b/llvm-3.7.1-cmake-s390.patch
@@ -0,0 +1,12 @@
+diff -up llvm-3.7.1.src/cmake/config-ix.cmake.s390 llvm-3.7.1.src/cmake/config-ix.cmake
+--- llvm-3.7.1.src/cmake/config-ix.cmake.s390 2016-02-16 12:27:36.000000000 +0100
++++ llvm-3.7.1.src/cmake/config-ix.cmake 2016-02-16 12:27:52.000000000 +0100
+@@ -356,6 +356,8 @@ elseif (LLVM_NATIVE_ARCH MATCHES "msp430
+ set(LLVM_NATIVE_ARCH MSP430)
+ elseif (LLVM_NATIVE_ARCH MATCHES "hexagon")
+ set(LLVM_NATIVE_ARCH Hexagon)
++elseif (LLVM_NATIVE_ARCH MATCHES "s390")
++ set(LLVM_NATIVE_ARCH SystemZ)
+ elseif (LLVM_NATIVE_ARCH MATCHES "s390x")
+ set(LLVM_NATIVE_ARCH SystemZ)
+ elseif (LLVM_NATIVE_ARCH MATCHES "wasm32")
diff --git a/llvm-D23597_sdag_names.patch b/llvm-D23597_sdag_names.patch
new file mode 100644
index 0000000..9eea510
--- /dev/null
+++ b/llvm-D23597_sdag_names.patch
@@ -0,0 +1,796 @@
+Index: include/llvm/Target/TargetSelectionDAG.td
+===================================================================
+--- a/include/llvm/Target/TargetSelectionDAG.td
++++ b/include/llvm/Target/TargetSelectionDAG.td
+@@ -450,10 +450,10 @@
+ def fceil : SDNode<"ISD::FCEIL" , SDTFPUnaryOp>;
+ def ffloor : SDNode<"ISD::FFLOOR" , SDTFPUnaryOp>;
+ def fnearbyint : SDNode<"ISD::FNEARBYINT" , SDTFPUnaryOp>;
+-def frnd : SDNode<"ISD::FROUND" , SDTFPUnaryOp>;
++def fround : SDNode<"ISD::FROUND" , SDTFPUnaryOp>;
+
+-def fround : SDNode<"ISD::FP_ROUND" , SDTFPRoundOp>;
+-def fextend : SDNode<"ISD::FP_EXTEND" , SDTFPExtendOp>;
++def fpround : SDNode<"ISD::FP_ROUND" , SDTFPRoundOp>;
++def fpextend : SDNode<"ISD::FP_EXTEND" , SDTFPExtendOp>;
+ def fcopysign : SDNode<"ISD::FCOPYSIGN" , SDTFPSignOp>;
+
+ def sint_to_fp : SDNode<"ISD::SINT_TO_FP" , SDTIntToFPOp>;
+Index: lib/Target/AArch64/AArch64InstrFormats.td
+===================================================================
+--- a/lib/Target/AArch64/AArch64InstrFormats.td
++++ b/lib/Target/AArch64/AArch64InstrFormats.td
+@@ -3936,27 +3936,27 @@
+ multiclass FPConversion {
+ // Double-precision to Half-precision
+ def HDr : BaseFPConversion<0b01, 0b11, FPR16, FPR64, asm,
+- [(set FPR16:$Rd, (fround FPR64:$Rn))]>;
++ [(set FPR16:$Rd, (fpround FPR64:$Rn))]>;
+
+ // Double-precision to Single-precision
+ def SDr : BaseFPConversion<0b01, 0b00, FPR32, FPR64, asm,
+- [(set FPR32:$Rd, (fround FPR64:$Rn))]>;
++ [(set FPR32:$Rd, (fpround FPR64:$Rn))]>;
+
+ // Half-precision to Double-precision
+ def DHr : BaseFPConversion<0b11, 0b01, FPR64, FPR16, asm,
+- [(set FPR64:$Rd, (fextend FPR16:$Rn))]>;
++ [(set FPR64:$Rd, (fpextend FPR16:$Rn))]>;
+
+ // Half-precision to Single-precision
+ def SHr : BaseFPConversion<0b11, 0b00, FPR32, FPR16, asm,
+- [(set FPR32:$Rd, (fextend FPR16:$Rn))]>;
++ [(set FPR32:$Rd, (fpextend FPR16:$Rn))]>;
+
+ // Single-precision to Double-precision
+ def DSr : BaseFPConversion<0b00, 0b01, FPR64, FPR32, asm,
+- [(set FPR64:$Rd, (fextend FPR32:$Rn))]>;
++ [(set FPR64:$Rd, (fpextend FPR32:$Rn))]>;
+
+ // Single-precision to Half-precision
+ def HSr : BaseFPConversion<0b00, 0b11, FPR16, FPR32, asm,
+- [(set FPR16:$Rd, (fround FPR32:$Rn))]>;
++ [(set FPR16:$Rd, (fpround FPR32:$Rn))]>;
+ }
+
+ //---
+Index: lib/Target/AArch64/AArch64InstrInfo.td
+===================================================================
+--- a/lib/Target/AArch64/AArch64InstrInfo.td
++++ b/lib/Target/AArch64/AArch64InstrInfo.td
+@@ -2545,8 +2545,8 @@
+ defm : FPToIntegerPats;
+ defm : FPToIntegerPats;
+ defm : FPToIntegerPats;
+-defm : FPToIntegerPats;
+-defm : FPToIntegerPats;
++defm : FPToIntegerPats;
++defm : FPToIntegerPats;
+
+ //===----------------------------------------------------------------------===//
+ // Scaled integer to floating point conversion instructions.
+@@ -2582,7 +2582,7 @@
+ defm FABS : SingleOperandFPData<0b0001, "fabs", fabs>;
+ defm FMOV : SingleOperandFPData<0b0000, "fmov">;
+ defm FNEG : SingleOperandFPData<0b0010, "fneg", fneg>;
+-defm FRINTA : SingleOperandFPData<0b1100, "frinta", frnd>;
++defm FRINTA : SingleOperandFPData<0b1100, "frinta", fround>;
+ defm FRINTI : SingleOperandFPData<0b1111, "frinti", fnearbyint>;
+ defm FRINTM : SingleOperandFPData<0b1010, "frintm", ffloor>;
+ defm FRINTN : SingleOperandFPData<0b1000, "frintn", int_aarch64_neon_frintn>;
+@@ -2788,13 +2788,13 @@
+ def : Pat<(v4f32 (int_aarch64_neon_vcvthf2fp (extract_subvector (v8i16 V128:$Rn),
+ (i64 4)))),
+ (FCVTLv8i16 V128:$Rn)>;
+-def : Pat<(v2f64 (fextend (v2f32 V64:$Rn))), (FCVTLv2i32 V64:$Rn)>;
+-def : Pat<(v2f64 (fextend (v2f32 (extract_subvector (v4f32 V128:$Rn),
++def : Pat<(v2f64 (fpextend (v2f32 V64:$Rn))), (FCVTLv2i32 V64:$Rn)>;
++def : Pat<(v2f64 (fpextend (v2f32 (extract_subvector (v4f32 V128:$Rn),
+ (i64 2))))),
+ (FCVTLv4i32 V128:$Rn)>;
+
+-def : Pat<(v4f32 (fextend (v4f16 V64:$Rn))), (FCVTLv4i16 V64:$Rn)>;
+-def : Pat<(v4f32 (fextend (v4f16 (extract_subvector (v8f16 V128:$Rn),
++def : Pat<(v4f32 (fpextend (v4f16 V64:$Rn))), (FCVTLv4i16 V64:$Rn)>;
++def : Pat<(v4f32 (fpextend (v4f16 (extract_subvector (v8f16 V128:$Rn),
+ (i64 4))))),
+ (FCVTLv8i16 V128:$Rn)>;
+
+@@ -2808,9 +2808,9 @@
+ def : Pat<(concat_vectors V64:$Rd,
+ (v4i16 (int_aarch64_neon_vcvtfp2hf (v4f32 V128:$Rn)))),
+ (FCVTNv8i16 (INSERT_SUBREG (IMPLICIT_DEF), V64:$Rd, dsub), V128:$Rn)>;
+-def : Pat<(v2f32 (fround (v2f64 V128:$Rn))), (FCVTNv2i32 V128:$Rn)>;
+-def : Pat<(v4f16 (fround (v4f32 V128:$Rn))), (FCVTNv4i16 V128:$Rn)>;
+-def : Pat<(concat_vectors V64:$Rd, (v2f32 (fround (v2f64 V128:$Rn)))),
++def : Pat<(v2f32 (fpround (v2f64 V128:$Rn))), (FCVTNv2i32 V128:$Rn)>;
++def : Pat<(v4f16 (fpround (v4f32 V128:$Rn))), (FCVTNv4i16 V128:$Rn)>;
++def : Pat<(concat_vectors V64:$Rd, (v2f32 (fpround (v2f64 V128:$Rn)))),
+ (FCVTNv4i32 (INSERT_SUBREG (IMPLICIT_DEF), V64:$Rd, dsub), V128:$Rn)>;
+ defm FCVTPS : SIMDTwoVectorFPToInt<0,1,0b11010, "fcvtps",int_aarch64_neon_fcvtps>;
+ defm FCVTPU : SIMDTwoVectorFPToInt<1,1,0b11010, "fcvtpu",int_aarch64_neon_fcvtpu>;
+@@ -2833,7 +2833,7 @@
+
+ defm FNEG : SIMDTwoVectorFP<1, 1, 0b01111, "fneg", fneg>;
+ defm FRECPE : SIMDTwoVectorFP<0, 1, 0b11101, "frecpe", int_aarch64_neon_frecpe>;
+-defm FRINTA : SIMDTwoVectorFP<1, 0, 0b11000, "frinta", frnd>;
++defm FRINTA : SIMDTwoVectorFP<1, 0, 0b11000, "frinta", fround>;
+ defm FRINTI : SIMDTwoVectorFP<1, 1, 0b11001, "frinti", fnearbyint>;
+ defm FRINTM : SIMDTwoVectorFP<0, 0, 0b11001, "frintm", ffloor>;
+ defm FRINTN : SIMDTwoVectorFP<0, 0, 0b11000, "frintn", int_aarch64_neon_frintn>;
+Index: lib/Target/AMDGPU/SIInstructions.td
+===================================================================
+--- a/lib/Target/AMDGPU/SIInstructions.td
++++ b/lib/Target/AMDGPU/SIInstructions.td
+@@ -1107,10 +1107,10 @@
+ VOP_I32_F32, cvt_flr_i32_f32>;
+ defm V_CVT_OFF_F32_I4 : VOP1Inst , "v_cvt_off_f32_i4", VOP_F32_I32>;
+ defm V_CVT_F32_F64 : VOP1Inst , "v_cvt_f32_f64",
+- VOP_F32_F64, fround
++ VOP_F32_F64, fpround
+ >;
+ defm V_CVT_F64_F32 : VOP1Inst , "v_cvt_f64_f32",
+- VOP_F64_F32, fextend
++ VOP_F64_F32, fpextend
+ >;
+ defm V_CVT_F32_UBYTE0 : VOP1Inst , "v_cvt_f32_ubyte0",
+ VOP_F32_I32, AMDGPUcvt_f32_ubyte0
+Index: lib/Target/ARM/ARMInstrVFP.td
+===================================================================
+--- a/lib/Target/ARM/ARMInstrVFP.td
++++ b/lib/Target/ARM/ARMInstrVFP.td
+@@ -624,7 +624,7 @@
+ def VCVTDS : ASuI<0b11101, 0b11, 0b0111, 0b11, 0,
+ (outs DPR:$Dd), (ins SPR:$Sm),
+ IIC_fpCVTDS, "vcvt", ".f64.f32\t$Dd, $Sm",
+- [(set DPR:$Dd, (fextend SPR:$Sm))]> {
++ [(set DPR:$Dd, (fpextend SPR:$Sm))]> {
+ // Instruction operands.
+ bits<5> Dd;
+ bits<5> Sm;
+@@ -641,7 +641,7 @@
+ // Special case encoding: bits 11-8 is 0b1011.
+ def VCVTSD : VFPAI<(outs SPR:$Sd), (ins DPR:$Dm), VFPUnaryFrm,
+ IIC_fpCVTSD, "vcvt", ".f32.f64\t$Sd, $Dm",
+- [(set SPR:$Sd, (fround DPR:$Dm))]> {
++ [(set SPR:$Sd, (fpround DPR:$Dm))]> {
+ // Instruction operands.
+ bits<5> Sd;
+ bits<5> Dm;
+@@ -838,7 +838,7 @@
+ }
+ }
+
+-defm VCVTA : vcvt_inst<"a", 0b00, frnd>;
++defm VCVTA : vcvt_inst<"a", 0b00, fround>;
+ defm VCVTN : vcvt_inst<"n", 0b01>;
+ defm VCVTP : vcvt_inst<"p", 0b10, fceil>;
+ defm VCVTM : vcvt_inst<"m", 0b11, ffloor>;
+@@ -938,7 +938,7 @@
+ Requires<[HasFPARMv8,HasDPVFP]>;
+ }
+
+-defm VRINTA : vrint_inst_anpm<"a", 0b00, frnd>;
++defm VRINTA : vrint_inst_anpm<"a", 0b00, fround>;
+ defm VRINTN : vrint_inst_anpm<"n", 0b01>;
+ defm VRINTP : vrint_inst_anpm<"p", 0b10, fceil>;
+ defm VRINTM : vrint_inst_anpm<"m", 0b11, ffloor>;
+Index: lib/Target/Hexagon/HexagonISelLowering.cpp
+===================================================================
+--- a/lib/Target/Hexagon/HexagonISelLowering.cpp
++++ b/lib/Target/Hexagon/HexagonISelLowering.cpp
+@@ -1906,7 +1906,7 @@
+ }
+ // Turn FP truncstore into trunc + store.
+ setTruncStoreAction(MVT::f64, MVT::f32, Expand);
+- // Turn FP extload into load/fextend.
++ // Turn FP extload into load/fpextend.
+ for (MVT VT : MVT::fp_valuetypes())
+ setLoadExtAction(ISD::EXTLOAD, VT, MVT::f32, Expand);
+
+Index: lib/Target/Hexagon/HexagonInstrInfoV5.td
+===================================================================
+--- a/lib/Target/Hexagon/HexagonInstrInfoV5.td
++++ b/lib/Target/Hexagon/HexagonInstrInfoV5.td
+@@ -564,10 +564,10 @@
+
+ // Convert single precision to double precision and vice-versa.
+ def F2_conv_sf2df : F2_RDD_RS_CONVERT <"convert_sf2df", 0b000,
+- fextend, F64, F32>;
++ fpextend, F64, F32>;
+
+ def F2_conv_df2sf : F2_RD_RSS_CONVERT <"convert_df2sf", 0b000,
+- fround, F32, F64>;
++ fpround, F32, F64>;
+
+ // Convert Integer to Floating Point.
+ def F2_conv_d2sf : F2_RD_RSS_CONVERT <"convert_d2sf", 0b010,
+Index: lib/Target/Mips/MipsInstrFPU.td
+===================================================================
+--- a/lib/Target/Mips/MipsInstrFPU.td
++++ b/lib/Target/Mips/MipsInstrFPU.td
+@@ -635,9 +635,9 @@
+ (PseudoCVT_D32_W GPR32Opnd:$src)>, FGR_32;
+ def : MipsPat<(MipsTruncIntFP AFGR64Opnd:$src),
+ (TRUNC_W_D32 AFGR64Opnd:$src)>, FGR_32;
+-def : MipsPat<(f32 (fround AFGR64Opnd:$src)),
++def : MipsPat<(f32 (fpround AFGR64Opnd:$src)),
+ (CVT_S_D32 AFGR64Opnd:$src)>, FGR_32;
+-def : MipsPat<(f64 (fextend FGR32Opnd:$src)),
++def : MipsPat<(f64 (fpextend FGR32Opnd:$src)),
+ (CVT_D32_S FGR32Opnd:$src)>, FGR_32;
+
+ def : MipsPat<(f64 fpimm0), (DMTC1 ZERO_64)>, FGR_64;
+@@ -657,9 +657,9 @@
+ def : MipsPat<(MipsTruncIntFP FGR64Opnd:$src),
+ (TRUNC_L_D64 FGR64Opnd:$src)>, FGR_64;
+
+-def : MipsPat<(f32 (fround FGR64Opnd:$src)),
++def : MipsPat<(f32 (fpround FGR64Opnd:$src)),
+ (CVT_S_D64 FGR64Opnd:$src)>, FGR_64;
+-def : MipsPat<(f64 (fextend FGR32Opnd:$src)),
++def : MipsPat<(f64 (fpextend FGR32Opnd:$src)),
+ (CVT_D64_S FGR32Opnd:$src)>, FGR_64;
+
+ // Patterns for loads/stores with a reg+imm operand.
+Index: lib/Target/NVPTX/NVPTXISelLowering.cpp
+===================================================================
+--- a/lib/Target/NVPTX/NVPTXISelLowering.cpp
++++ b/lib/Target/NVPTX/NVPTXISelLowering.cpp
+@@ -206,7 +206,7 @@
+ // intrinsics.
+ setOperationAction(ISD::INTRINSIC_W_CHAIN, MVT::Other, Custom);
+
+- // Turn FP extload into load/fextend
++ // Turn FP extload into load/fpextend
+ setLoadExtAction(ISD::EXTLOAD, MVT::f32, MVT::f16, Expand);
+ setLoadExtAction(ISD::EXTLOAD, MVT::f64, MVT::f16, Expand);
+ setLoadExtAction(ISD::EXTLOAD, MVT::f64, MVT::f32, Expand);
+Index: lib/Target/NVPTX/NVPTXInstrInfo.td
+===================================================================
+--- a/lib/Target/NVPTX/NVPTXInstrInfo.td
++++ b/lib/Target/NVPTX/NVPTXInstrInfo.td
+@@ -2613,16 +2613,16 @@
+ def : Pat<(ctpop Int16Regs:$a),
+ (CVT_u16_u32 (POPCr32 (CVT_u32_u16 Int16Regs:$a, CvtNONE)), CvtNONE)>;
+
+-// fround f64 -> f32
+-def : Pat<(f32 (fround Float64Regs:$a)),
++// fpround f64 -> f32
++def : Pat<(f32 (fpround Float64Regs:$a)),
+ (CVT_f32_f64 Float64Regs:$a, CvtRN_FTZ)>, Requires<[doF32FTZ]>;
+-def : Pat<(f32 (fround Float64Regs:$a)),
++def : Pat<(f32 (fpround Float64Regs:$a)),
+ (CVT_f32_f64 Float64Regs:$a, CvtRN)>;
+
+-// fextend f32 -> f64
+-def : Pat<(f64 (fextend Float32Regs:$a)),
++// fpextend f32 -> f64
++def : Pat<(f64 (fpextend Float32Regs:$a)),
+ (CVT_f64_f32 Float32Regs:$a, CvtNONE_FTZ)>, Requires<[doF32FTZ]>;
+-def : Pat<(f64 (fextend Float32Regs:$a)),
++def : Pat<(f64 (fpextend Float32Regs:$a)),
+ (CVT_f64_f32 Float32Regs:$a, CvtNONE)>;
+
+ def retflag : SDNode<"NVPTXISD::RET_FLAG", SDTNone,
+Index: lib/Target/PowerPC/PPCInstrInfo.td
+===================================================================
+--- a/lib/Target/PowerPC/PPCInstrInfo.td
++++ b/lib/Target/PowerPC/PPCInstrInfo.td
+@@ -2110,15 +2110,15 @@
+
+ defm FRSP : XForm_26r<63, 12, (outs f4rc:$frD), (ins f8rc:$frB),
+ "frsp", "$frD, $frB", IIC_FPGeneral,
+- [(set f32:$frD, (fround f64:$frB))]>;
++ [(set f32:$frD, (fpround f64:$frB))]>;
+
+ let Interpretation64Bit = 1, isCodeGenOnly = 1 in
+ defm FRIND : XForm_26r<63, 392, (outs f8rc:$frD), (ins f8rc:$frB),
+ "frin", "$frD, $frB", IIC_FPGeneral,
+- [(set f64:$frD, (frnd f64:$frB))]>;
++ [(set f64:$frD, (fround f64:$frB))]>;
+ defm FRINS : XForm_26r<63, 392, (outs f4rc:$frD), (ins f4rc:$frB),
+ "frin", "$frD, $frB", IIC_FPGeneral,
+- [(set f32:$frD, (frnd f32:$frB))]>;
++ [(set f32:$frD, (fround f32:$frB))]>;
+ }
+
+ let hasSideEffects = 0 in {
+@@ -2856,7 +2856,7 @@
+ def : Pat<(f64 (extloadf32 xaddr:$src)),
+ (COPY_TO_REGCLASS (LFSX xaddr:$src), F8RC)>;
+
+-def : Pat<(f64 (fextend f32:$src)),
++def : Pat<(f64 (fpextend f32:$src)),
+ (COPY_TO_REGCLASS $src, F8RC)>;
+
+ // Only seq_cst fences require the heavyweight sync (SYNC 0).
+Index: lib/Target/PowerPC/PPCInstrQPX.td
+===================================================================
+--- a/lib/Target/PowerPC/PPCInstrQPX.td
++++ b/lib/Target/PowerPC/PPCInstrQPX.td
+@@ -88,11 +88,11 @@
+ return cast(N)->getMemoryVT() == MVT::v4f32;
+ }]>;
+
+-def fround_inexact : PatFrag<(ops node:$val), (fround node:$val), [{
++def fround_inexact : PatFrag<(ops node:$val), (fpround node:$val), [{
+ return cast(N->getOperand(1))->getZExtValue() == 0;
+ }]>;
+
+-def fround_exact : PatFrag<(ops node:$val), (fround node:$val), [{
++def fround_exact : PatFrag<(ops node:$val), (fpround node:$val), [{
+ return cast(N->getOperand(1))->getZExtValue() == 1;
+ }]>;
+
+@@ -311,11 +311,11 @@
+
+ def QVFRIN : XForm_19<4, 392, (outs qfrc:$FRT), (ins qfrc:$FRB),
+ "qvfrin $FRT, $FRB", IIC_FPGeneral,
+- [(set v4f64:$FRT, (frnd v4f64:$FRB))]>;
++ [(set v4f64:$FRT, (fround v4f64:$FRB))]>;
+ let isCodeGenOnly = 1 in
+ def QVFRINs : XForm_19<4, 392, (outs qsrc:$FRT), (ins qsrc:$FRB),
+ "qvfrin $FRT, $FRB", IIC_FPGeneral,
+- [(set v4f32:$FRT, (frnd v4f32:$FRB))]>;
++ [(set v4f32:$FRT, (fround v4f32:$FRB))]>;
+
+ def QVFRIP : XForm_19<4, 456, (outs qfrc:$FRT), (ins qfrc:$FRB),
+ "qvfrip $FRT, $FRB", IIC_FPGeneral,
+@@ -1103,7 +1103,7 @@
+ def : Pat<(not v4i1:$FRA),
+ (QVFLOGICALb $FRA, $FRA, (i32 10))>;
+
+-def : Pat<(v4f64 (fextend v4f32:$src)),
++def : Pat<(v4f64 (fpextend v4f32:$src)),
+ (COPY_TO_REGCLASS $src, QFRC)>;
+
+ def : Pat<(v4f32 (fround_exact v4f64:$src)),
+Index: lib/Target/PowerPC/PPCInstrVSX.td
+===================================================================
+--- a/lib/Target/PowerPC/PPCInstrVSX.td
++++ b/lib/Target/PowerPC/PPCInstrVSX.td
+@@ -634,7 +634,7 @@
+ def XSRDPI : XX2Form<60, 73,
+ (outs vsfrc:$XT), (ins vsfrc:$XB),
+ "xsrdpi $XT, $XB", IIC_VecFP,
+- [(set f64:$XT, (frnd f64:$XB))]>;
++ [(set f64:$XT, (fround f64:$XB))]>;
+ def XSRDPIC : XX2Form<60, 107,
+ (outs vsfrc:$XT), (ins vsfrc:$XB),
+ "xsrdpic $XT, $XB", IIC_VecFP,
+@@ -655,7 +655,7 @@
+ def XVRDPI : XX2Form<60, 201,
+ (outs vsrc:$XT), (ins vsrc:$XB),
+ "xvrdpi $XT, $XB", IIC_VecFP,
+- [(set v2f64:$XT, (frnd v2f64:$XB))]>;
++ [(set v2f64:$XT, (fround v2f64:$XB))]>;
+ def XVRDPIC : XX2Form<60, 235,
+ (outs vsrc:$XT), (ins vsrc:$XB),
+ "xvrdpic $XT, $XB", IIC_VecFP,
+@@ -676,7 +676,7 @@
+ def XVRSPI : XX2Form<60, 137,
+ (outs vsrc:$XT), (ins vsrc:$XB),
+ "xvrspi $XT, $XB", IIC_VecFP,
+- [(set v4f32:$XT, (frnd v4f32:$XB))]>;
++ [(set v4f32:$XT, (fround v4f32:$XB))]>;
+ def XVRSPIC : XX2Form<60, 171,
+ (outs vsrc:$XT), (ins vsrc:$XB),
+ "xvrspic $XT, $XB", IIC_VecFP,
+@@ -1108,7 +1108,7 @@
+
+ def : Pat<(f64 (extloadf32 xoaddr:$src)),
+ (COPY_TO_REGCLASS (LXSSPX xoaddr:$src), VSFRC)>;
+- def : Pat<(f64 (fextend f32:$src)),
++ def : Pat<(f64 (fpextend f32:$src)),
+ (COPY_TO_REGCLASS $src, VSFRC)>;
+
+ def : Pat<(f32 (selectcc i1:$lhs, i1:$rhs, f32:$tval, f32:$fval, SETLT)),
+Index: lib/Target/Sparc/SparcISelLowering.cpp
+===================================================================
+--- a/lib/Target/Sparc/SparcISelLowering.cpp
++++ b/lib/Target/Sparc/SparcISelLowering.cpp
+@@ -1508,7 +1508,7 @@
+ // AddPromotedToType(ISD::STORE, MVT::i64, MVT::v2i32);
+ }
+
+- // Turn FP extload into load/fextend
++ // Turn FP extload into load/fpextend
+ for (MVT VT : MVT::fp_valuetypes()) {
+ setLoadExtAction(ISD::EXTLOAD, VT, MVT::f32, Expand);
+ setLoadExtAction(ISD::EXTLOAD, VT, MVT::f64, Expand);
+Index: lib/Target/Sparc/SparcInstrInfo.td
+===================================================================
+--- a/lib/Target/Sparc/SparcInstrInfo.td
++++ b/lib/Target/Sparc/SparcInstrInfo.td
+@@ -1131,32 +1131,32 @@
+ def FSTOD : F3_3u<2, 0b110100, 0b011001001,
+ (outs DFPRegs:$rd), (ins FPRegs:$rs2),
+ "fstod $rs2, $rd",
+- [(set f64:$rd, (fextend f32:$rs2))],
++ [(set f64:$rd, (fpextend f32:$rs2))],
+ IIC_fpu_stod>;
+ def FSTOQ : F3_3u<2, 0b110100, 0b011001101,
+ (outs QFPRegs:$rd), (ins FPRegs:$rs2),
+ "fstoq $rs2, $rd",
+- [(set f128:$rd, (fextend f32:$rs2))]>,
++ [(set f128:$rd, (fpextend f32:$rs2))]>,
+ Requires<[HasHardQuad]>;
+ def FDTOS : F3_3u<2, 0b110100, 0b011000110,
+ (outs FPRegs:$rd), (ins DFPRegs:$rs2),
+ "fdtos $rs2, $rd",
+- [(set f32:$rd, (fround f64:$rs2))],
++ [(set f32:$rd, (fpround f64:$rs2))],
+ IIC_fpu_fast_instr>;
+ def FDTOQ : F3_3u<2, 0b110100, 0b011001110,
+ (outs QFPRegs:$rd), (ins DFPRegs:$rs2),
+ "fdtoq $rs2, $rd",
+- [(set f128:$rd, (fextend f64:$rs2))]>,
++ [(set f128:$rd, (fpextend f64:$rs2))]>,
+ Requires<[HasHardQuad]>;
+ def FQTOS : F3_3u<2, 0b110100, 0b011000111,
+ (outs FPRegs:$rd), (ins QFPRegs:$rs2),
+ "fqtos $rs2, $rd",
+- [(set f32:$rd, (fround f128:$rs2))]>,
++ [(set f32:$rd, (fpround f128:$rs2))]>,
+ Requires<[HasHardQuad]>;
+ def FQTOD : F3_3u<2, 0b110100, 0b011001011,
+ (outs DFPRegs:$rd), (ins QFPRegs:$rs2),
+ "fqtod $rs2, $rd",
+- [(set f64:$rd, (fround f128:$rs2))]>,
++ [(set f64:$rd, (fpround f128:$rs2))]>,
+ Requires<[HasHardQuad]>;
+
+ // Floating-point Move Instructions, p. 144
+@@ -1255,14 +1255,14 @@
+ def FSMULD : F3_3<2, 0b110100, 0b001101001,
+ (outs DFPRegs:$rd), (ins FPRegs:$rs1, FPRegs:$rs2),
+ "fsmuld $rs1, $rs2, $rd",
+- [(set f64:$rd, (fmul (fextend f32:$rs1),
+- (fextend f32:$rs2)))],
++ [(set f64:$rd, (fmul (fpextend f32:$rs1),
++ (fpextend f32:$rs2)))],
+ IIC_fpu_muld>;
+ def FDMULQ : F3_3<2, 0b110100, 0b001101110,
+ (outs QFPRegs:$rd), (ins DFPRegs:$rs1, DFPRegs:$rs2),
+ "fdmulq $rs1, $rs2, $rd",
+- [(set f128:$rd, (fmul (fextend f64:$rs1),
+- (fextend f64:$rs2)))]>,
++ [(set f128:$rd, (fmul (fpextend f64:$rs1),
++ (fpextend f64:$rs2)))]>,
+ Requires<[HasHardQuad]>;
+
+ // FDIVS generates an erratum on LEON processors, so by disabling this instruction
+Index: lib/Target/SystemZ/SystemZISelLowering.cpp
+===================================================================
+--- a/lib/Target/SystemZ/SystemZISelLowering.cpp
++++ b/lib/Target/SystemZ/SystemZISelLowering.cpp
+@@ -4995,8 +4995,8 @@
+
+ SDValue SystemZTargetLowering::combineFP_ROUND(
+ SDNode *N, DAGCombinerInfo &DCI) const {
+- // (fround (extract_vector_elt X 0))
+- // (fround (extract_vector_elt X 1)) ->
++ // (fpround (extract_vector_elt X 0))
++ // (fpround (extract_vector_elt X 1)) ->
+ // (extract_vector_elt (VROUND X) 0)
+ // (extract_vector_elt (VROUND X) 1)
+ //
+Index: lib/Target/SystemZ/SystemZInstrFP.td
+===================================================================
+--- a/lib/Target/SystemZ/SystemZInstrFP.td
++++ b/lib/Target/SystemZ/SystemZInstrFP.td
+@@ -154,7 +154,7 @@
+ // Convert floating-point values to narrower representations, rounding
+ // according to the current mode. The destination of LEXBR and LDXBR
+ // is a 128-bit value, but only the first register of the pair is used.
+-def LEDBR : UnaryRRE<"ledb", 0xB344, fround, FP32, FP64>;
++def LEDBR : UnaryRRE<"ledb", 0xB344, fpround, FP32, FP64>;
+ def LEXBR : UnaryRRE<"lexb", 0xB346, null_frag, FP128, FP128>;
+ def LDXBR : UnaryRRE<"ldxb", 0xB345, null_frag, FP128, FP128>;
+
+@@ -165,15 +165,15 @@
+ def LDXBRA : UnaryRRF4<"ldxbra", 0xB345, FP128, FP128>,
+ Requires<[FeatureFPExtension]>;
+
+-def : Pat<(f32 (fround FP128:$src)),
++def : Pat<(f32 (fpround FP128:$src)),
+ (EXTRACT_SUBREG (LEXBR FP128:$src), subreg_hr32)>;
+-def : Pat<(f64 (fround FP128:$src)),
++def : Pat<(f64 (fpround FP128:$src)),
+ (EXTRACT_SUBREG (LDXBR FP128:$src), subreg_h64)>;
+
+ // Extend register floating-point values to wider representations.
+-def LDEBR : UnaryRRE<"ldeb", 0xB304, fextend, FP64, FP32>;
+-def LXEBR : UnaryRRE<"lxeb", 0xB306, fextend, FP128, FP32>;
+-def LXDBR : UnaryRRE<"lxdb", 0xB305, fextend, FP128, FP64>;
++def LDEBR : UnaryRRE<"ldeb", 0xB304, fpextend, FP64, FP32>;
++def LXEBR : UnaryRRE<"lxeb", 0xB306, fpextend, FP128, FP32>;
++def LXDBR : UnaryRRE<"lxdb", 0xB305, fpextend, FP128, FP64>;
+
+ // Extend memory floating-point values to wider representations.
+ def LDEB : UnaryRXE<"ldeb", 0xED04, extloadf32, FP64, 4>;
+@@ -347,9 +347,9 @@
+
+ // Same idea for round, where mode 1 is round towards nearest with
+ // ties away from zero.
+- def : Pat<(frnd FP32:$src), (FIEBRA 1, FP32:$src, 4)>;
+- def : Pat<(frnd FP64:$src), (FIDBRA 1, FP64:$src, 4)>;
+- def : Pat<(frnd FP128:$src), (FIXBRA 1, FP128:$src, 4)>;
++ def : Pat<(fround FP32:$src), (FIEBRA 1, FP32:$src, 4)>;
++ def : Pat<(fround FP64:$src), (FIDBRA 1, FP64:$src, 4)>;
++ def : Pat<(fround FP128:$src), (FIXBRA 1, FP128:$src, 4)>;
+ }
+
+ //===----------------------------------------------------------------------===//
+@@ -388,26 +388,26 @@
+
+ // f64 multiplication of two FP32 registers.
+ def MDEBR : BinaryRRE<"mdeb", 0xB30C, null_frag, FP64, FP32>;
+-def : Pat<(fmul (f64 (fextend FP32:$src1)), (f64 (fextend FP32:$src2))),
++def : Pat<(fmul (f64 (fpextend FP32:$src1)), (f64 (fpextend FP32:$src2))),
+ (MDEBR (INSERT_SUBREG (f64 (IMPLICIT_DEF)),
+ FP32:$src1, subreg_r32), FP32:$src2)>;
+
+ // f64 multiplication of an FP32 register and an f32 memory.
+ def MDEB : BinaryRXE<"mdeb", 0xED0C, null_frag, FP64, load, 4>;
+-def : Pat<(fmul (f64 (fextend FP32:$src1)),
++def : Pat<(fmul (f64 (fpextend FP32:$src1)),
+ (f64 (extloadf32 bdxaddr12only:$addr))),
+ (MDEB (INSERT_SUBREG (f64 (IMPLICIT_DEF)), FP32:$src1, subreg_r32),
+ bdxaddr12only:$addr)>;
+
+ // f128 multiplication of two FP64 registers.
+ def MXDBR : BinaryRRE<"mxdb", 0xB307, null_frag, FP128, FP64>;
+-def : Pat<(fmul (f128 (fextend FP64:$src1)), (f128 (fextend FP64:$src2))),
++def : Pat<(fmul (f128 (fpextend FP64:$src1)), (f128 (fpextend FP64:$src2))),
+ (MXDBR (INSERT_SUBREG (f128 (IMPLICIT_DEF)),
+ FP64:$src1, subreg_h64), FP64:$src2)>;
+
+ // f128 multiplication of an FP64 register and an f64 memory.
+ def MXDB : BinaryRXE<"mxdb", 0xED07, null_frag, FP128, load, 8>;
+-def : Pat<(fmul (f128 (fextend FP64:$src1)),
++def : Pat<(fmul (f128 (fpextend FP64:$src1)),
+ (f128 (extloadf64 bdxaddr12only:$addr))),
+ (MXDB (INSERT_SUBREG (f128 (IMPLICIT_DEF)), FP64:$src1, subreg_h64),
+ bdxaddr12only:$addr)>;
+Index: lib/Target/SystemZ/SystemZInstrVector.td
+===================================================================
+--- a/lib/Target/SystemZ/SystemZInstrVector.td
++++ b/lib/Target/SystemZ/SystemZInstrVector.td
+@@ -798,7 +798,7 @@
+ def : FPConversion;
+ def : FPConversion;
+ def : FPConversion;
+- def : FPConversion;
++ def : FPConversion;
+ }
+
+ let Predicates = [FeatureVector] in {
+@@ -840,13 +840,13 @@
+
+ // Load lengthened.
+ def VLDEB : UnaryVRRa<"vldeb", 0xE7C4, z_vextend, v128db, v128eb, 2, 0>;
+- def WLDEB : UnaryVRRa<"wldeb", 0xE7C4, fextend, v64db, v32eb, 2, 8>;
++ def WLDEB : UnaryVRRa<"wldeb", 0xE7C4, fpextend, v64db, v32eb, 2, 8>;
+
+ // Load rounded,
+ def VLEDB : TernaryVRRa<"vledb", 0xE7C5, null_frag, v128eb, v128db, 3, 0>;
+ def WLEDB : TernaryVRRa<"wledb", 0xE7C5, null_frag, v32eb, v64db, 3, 8>;
+ def : Pat<(v4f32 (z_vround (v2f64 VR128:$src))), (VLEDB VR128:$src, 0, 0)>;
+- def : FPConversion;
++ def : FPConversion;
+
+ // Multiply.
+ def VFMDB : BinaryVRRc<"vfmdb", 0xE7E7, fmul, v128db, v128db, 3, 0>;
+Index: lib/Target/WebAssembly/WebAssemblyInstrConv.td
+===================================================================
+--- a/lib/Target/WebAssembly/WebAssemblyInstrConv.td
++++ b/lib/Target/WebAssembly/WebAssemblyInstrConv.td
+@@ -89,10 +89,10 @@
+ "f64.convert_u/i64\t$dst, $src">;
+
+ def F64_PROMOTE_F32 : I<(outs F64:$dst), (ins F32:$src),
+- [(set F64:$dst, (fextend F32:$src))],
++ [(set F64:$dst, (fpextend F32:$src))],
+ "f64.promote/f32\t$dst, $src">;
+ def F32_DEMOTE_F64 : I<(outs F32:$dst), (ins F64:$src),
+- [(set F32:$dst, (fround F64:$src))],
++ [(set F32:$dst, (fpround F64:$src))],
+ "f32.demote/f64\t$dst, $src">;
+
+ def I32_REINTERPRET_F32 : I<(outs I32:$dst), (ins F32:$src),
+Index: lib/Target/X86/X86InstrAVX512.td
+===================================================================
+--- a/lib/Target/X86/X86InstrAVX512.td
++++ b/lib/Target/X86/X86InstrAVX512.td
+@@ -5595,11 +5595,11 @@
+ defm VCVTSS2SD : avx512_cvt_fp_scalar_ss2sd<0x5A, "vcvtss2sd", X86fpext,
+ X86fpextRnd,f32x_info, f64x_info >;
+
+-def : Pat<(f64 (fextend FR32X:$src)),
++def : Pat<(f64 (fpextend FR32X:$src)),
+ (COPY_TO_REGCLASS (VCVTSS2SDZrr (COPY_TO_REGCLASS FR32X:$src, VR128X),
+ (COPY_TO_REGCLASS FR32X:$src, VR128X)), VR128X)>,
+ Requires<[HasAVX512]>;
+-def : Pat<(f64 (fextend (loadf32 addr:$src))),
++def : Pat<(f64 (fpextend (loadf32 addr:$src))),
+ (COPY_TO_REGCLASS (VCVTSS2SDZrm (v4f32 (IMPLICIT_DEF)), addr:$src), VR128X)>,
+ Requires<[HasAVX512]>;
+
+@@ -5612,7 +5612,7 @@
+ (COPY_TO_REGCLASS (VMOVSSZrm addr:$src), VR128X)), VR128X)>,
+ Requires<[HasAVX512, OptForSpeed]>;
+
+-def : Pat<(f32 (fround FR64X:$src)),
++def : Pat<(f32 (fpround FR64X:$src)),
+ (COPY_TO_REGCLASS (VCVTSD2SSZrr (COPY_TO_REGCLASS FR64X:$src, VR128X),
+ (COPY_TO_REGCLASS FR64X:$src, VR128X)), VR128X)>,
+ Requires<[HasAVX512]>;
+@@ -5666,29 +5666,29 @@
+ // Extend Float to Double
+ multiclass avx512_cvtps2pd opc, string OpcodeStr> {
+ let Predicates = [HasAVX512] in {
+- defm Z : avx512_vcvt_fp,
++ defm Z : avx512_vcvt_fp,
+ avx512_vcvt_fp_sae, EVEX_V512;
+ }
+ let Predicates = [HasVLX] in {
+ defm Z128 : avx512_vcvt_fp, EVEX_V128;
+- defm Z256 : avx512_vcvt_fp,
++ EVEX_V256;
+ }
+ }
+
+ // Truncate Double to Float
+ multiclass avx512_cvtpd2ps opc, string OpcodeStr> {
+ let Predicates = [HasAVX512] in {
+- defm Z : avx512_vcvt_fp,
++ defm Z : avx512_vcvt_fp,
+ avx512_vcvt_fp_rc, EVEX_V512;
+ }
+ let Predicates = [HasVLX] in {
+ defm Z128 : avx512_vcvt_fp, EVEX_V128;
+- defm Z256 : avx512_vcvt_fp, EVEX_V256;
+ }
+ }
+@@ -6025,7 +6025,7 @@
+ }
+
+ let Predicates = [HasAVX512] in {
+- def : Pat<(v8f32 (fround (loadv8f64 addr:$src))),
++ def : Pat<(v8f32 (fpround (loadv8f64 addr:$src))),
+ (VCVTPD2PSZrm addr:$src)>;
+ def : Pat<(v8f64 (extloadv8f32 addr:$src)),
+ (VCVTPS2PDZrm addr:$src)>;
+Index: lib/Target/X86/X86InstrFPStack.td
+===================================================================
+--- a/lib/Target/X86/X86InstrFPStack.td
++++ b/lib/Target/X86/X86InstrFPStack.td
+@@ -711,19 +711,19 @@
+
+ // FP extensions map onto simple pseudo-value conversions if they are to/from
+ // the FP stack.
+-def : Pat<(f64 (fextend RFP32:$src)), (COPY_TO_REGCLASS RFP32:$src, RFP64)>,
++def : Pat<(f64 (fpextend RFP32:$src)), (COPY_TO_REGCLASS RFP32:$src, RFP64)>,
+ Requires<[FPStackf32]>;
+-def : Pat<(f80 (fextend RFP32:$src)), (COPY_TO_REGCLASS RFP32:$src, RFP80)>,
++def : Pat<(f80 (fpextend RFP32:$src)), (COPY_TO_REGCLASS RFP32:$src, RFP80)>,
+ Requires<[FPStackf32]>;
+-def : Pat<(f80 (fextend RFP64:$src)), (COPY_TO_REGCLASS RFP64:$src, RFP80)>,
++def : Pat<(f80 (fpextend RFP64:$src)), (COPY_TO_REGCLASS RFP64:$src, RFP80)>,
+ Requires<[FPStackf64]>;
+
+ // FP truncations map onto simple pseudo-value conversions if they are to/from
+ // the FP stack. We have validated that only value-preserving truncations make
+ // it through isel.
+-def : Pat<(f32 (fround RFP64:$src)), (COPY_TO_REGCLASS RFP64:$src, RFP32)>,
++def : Pat<(f32 (fpround RFP64:$src)), (COPY_TO_REGCLASS RFP64:$src, RFP32)>,
+ Requires<[FPStackf32]>;
+-def : Pat<(f32 (fround RFP80:$src)), (COPY_TO_REGCLASS RFP80:$src, RFP32)>,
++def : Pat<(f32 (fpround RFP80:$src)), (COPY_TO_REGCLASS RFP80:$src, RFP32)>,
+ Requires<[FPStackf32]>;
+-def : Pat<(f64 (fround RFP80:$src)), (COPY_TO_REGCLASS RFP80:$src, RFP64)>,
++def : Pat<(f64 (fpround RFP80:$src)), (COPY_TO_REGCLASS RFP80:$src, RFP64)>,
+ Requires<[FPStackf64]>;
+Index: lib/Target/X86/X86InstrSSE.td
+===================================================================
+--- a/lib/Target/X86/X86InstrSSE.td
++++ b/lib/Target/X86/X86InstrSSE.td
+@@ -1799,16 +1799,16 @@
+ Sched<[WriteCvtF2FLd, ReadAfterLd]>;
+ }
+
+-def : Pat<(f32 (fround FR64:$src)), (VCVTSD2SSrr FR64:$src, FR64:$src)>,
++def : Pat<(f32 (fpround FR64:$src)), (VCVTSD2SSrr FR64:$src, FR64:$src)>,
+ Requires<[UseAVX]>;
+
+ def CVTSD2SSrr : SDI<0x5A, MRMSrcReg, (outs FR32:$dst), (ins FR64:$src),
+ "cvtsd2ss\t{$src, $dst|$dst, $src}",
+- [(set FR32:$dst, (fround FR64:$src))],
++ [(set FR32:$dst, (fpround FR64:$src))],
+ IIC_SSE_CVT_Scalar_RR>, Sched<[WriteCvtF2F]>;
+ def CVTSD2SSrm : I<0x5A, MRMSrcMem, (outs FR32:$dst), (ins f64mem:$src),
+ "cvtsd2ss\t{$src, $dst|$dst, $src}",
+- [(set FR32:$dst, (fround (loadf64 addr:$src)))],
++ [(set FR32:$dst, (fpround (loadf64 addr:$src)))],
+ IIC_SSE_CVT_Scalar_RM>,
+ XD,
+ Requires<[UseSSE2, OptForSize]>, Sched<[WriteCvtF2FLd]>;
+@@ -1865,9 +1865,9 @@
+ Sched<[WriteCvtF2FLd, ReadAfterLd]>;
+ }
+
+-def : Pat<(f64 (fextend FR32:$src)),
++def : Pat<(f64 (fpextend FR32:$src)),
+ (VCVTSS2SDrr FR32:$src, FR32:$src)>, Requires<[UseAVX]>;
+-def : Pat<(fextend (loadf32 addr:$src)),
++def : Pat<(fpextend (loadf32 addr:$src)),
+ (VCVTSS2SDrm (f32 (IMPLICIT_DEF)), addr:$src)>, Requires<[UseAVX]>;
+
+ def : Pat<(extloadf32 addr:$src),
+@@ -1879,21 +1879,21 @@
+
+ def CVTSS2SDrr : I<0x5A, MRMSrcReg, (outs FR64:$dst), (ins FR32:$src),
+ "cvtss2sd\t{$src, $dst|$dst, $src}",
+- [(set FR64:$dst, (fextend FR32:$src))],
++ [(set FR64:$dst, (fpextend FR32:$src))],
+ IIC_SSE_CVT_Scalar_RR>, XS,
+ Requires<[UseSSE2]>, Sched<[WriteCvtF2F]>;
+ def CVTSS2SDrm : I<0x5A, MRMSrcMem, (outs FR64:$dst), (ins f32mem:$src),
+ "cvtss2sd\t{$src, $dst|$dst, $src}",
+ [(set FR64:$dst, (extloadf32 addr:$src))],
+ IIC_SSE_CVT_Scalar_RM>, XS,
+ Requires<[UseSSE2, OptForSize]>, Sched<[WriteCvtF2FLd]>;
+
+-// extload f32 -> f64. This matches load+fextend because we have a hack in
++// extload f32 -> f64. This matches load+fpextend because we have a hack in
+ // the isel (PreprocessForFPConvert) that can introduce loads after dag
+ // combine.
+-// Since these loads aren't folded into the fextend, we have to match it
++// Since these loads aren't folded into the fpextend, we have to match it
+ // explicitly here.
+-def : Pat<(fextend (loadf32 addr:$src)),
++def : Pat<(fpextend (loadf32 addr:$src)),
+ (CVTSS2SDrm addr:$src)>, Requires<[UseSSE2]>;
+ def : Pat<(extloadf32 addr:$src),
+ (CVTSS2SDrr (MOVSSrm addr:$src))>, Requires<[UseSSE2, OptForSpeed]>;
+@@ -2269,26 +2269,26 @@
+ }
+
+ let Predicates = [HasAVX, NoVLX] in {
+- // Match fround and fextend for 128/256-bit conversions
++ // Match fpround and fpextend for 128/256-bit conversions
+ def : Pat<(v4f32 (X86vfpround (v2f64 VR128:$src))),
+ (VCVTPD2PSrr VR128:$src)>;
+ def : Pat<(v4f32 (X86vfpround (loadv2f64 addr:$src))),
+ (VCVTPD2PSXrm addr:$src)>;
+- def : Pat<(v4f32 (fround (v4f64 VR256:$src))),
++ def : Pat<(v4f32 (fpround (v4f64 VR256:$src))),
+ (VCVTPD2PSYrr VR256:$src)>;
+- def : Pat<(v4f32 (fround (loadv4f64 addr:$src))),
++ def : Pat<(v4f32 (fpround (loadv4f64 addr:$src))),
+ (VCVTPD2PSYrm addr:$src)>;
+
+ def : Pat<(v2f64 (X86vfpext (v4f32 VR128:$src))),
+ (VCVTPS2PDrr VR128:$src)>;
+- def : Pat<(v4f64 (fextend (v4f32 VR128:$src))),
++ def : Pat<(v4f64 (fpextend (v4f32 VR128:$src))),
+ (VCVTPS2PDYrr VR128:$src)>;
+ def : Pat<(v4f64 (extloadv4f32 addr:$src)),
+ (VCVTPS2PDYrm addr:$src)>;
+ }
+
+ let Predicates = [UseSSE2] in {
+- // Match fround and fextend for 128 conversions
++ // Match fpround and fpextend for 128 conversions
+ def : Pat<(v4f32 (X86vfpround (v2f64 VR128:$src))),
+ (CVTPD2PSrr VR128:$src)>;
+ def : Pat<(v4f32 (X86vfpround (memopv2f64 addr:$src))),
diff --git a/llvm-D24300_ptx_intrinsics.patch b/llvm-D24300_ptx_intrinsics.patch
new file mode 100644
index 0000000..e0c1e5a
--- /dev/null
+++ b/llvm-D24300_ptx_intrinsics.patch
@@ -0,0 +1,506 @@
+Index: lib/Target/NVPTX/NVPTXISelLowering.cpp
+===================================================================
+--- a/lib/Target/NVPTX/NVPTXISelLowering.cpp
++++ b/lib/Target/NVPTX/NVPTXISelLowering.cpp
+@@ -279,6 +279,28 @@
+ setTargetDAGCombine(ISD::SHL);
+ setTargetDAGCombine(ISD::SELECT);
+
++ // Library functions. These default to Expand, but we have instructions
++ // for them.
++ setOperationAction(ISD::FCEIL, MVT::f32, Legal);
++ setOperationAction(ISD::FCEIL, MVT::f64, Legal);
++ setOperationAction(ISD::FFLOOR, MVT::f32, Legal);
++ setOperationAction(ISD::FFLOOR, MVT::f64, Legal);
++ setOperationAction(ISD::FNEARBYINT, MVT::f32, Legal);
++ setOperationAction(ISD::FNEARBYINT, MVT::f64, Legal);
++ setOperationAction(ISD::FRINT, MVT::f32, Legal);
++ setOperationAction(ISD::FRINT, MVT::f64, Legal);
++ setOperationAction(ISD::FROUND, MVT::f32, Legal);
++ setOperationAction(ISD::FROUND, MVT::f64, Legal);
++ setOperationAction(ISD::FTRUNC, MVT::f32, Legal);
++ setOperationAction(ISD::FTRUNC, MVT::f64, Legal);
++ setOperationAction(ISD::FMINNUM, MVT::f32, Legal);
++ setOperationAction(ISD::FMINNUM, MVT::f64, Legal);
++ setOperationAction(ISD::FMAXNUM, MVT::f32, Legal);
++ setOperationAction(ISD::FMAXNUM, MVT::f64, Legal);
++
++ // No FEXP2, FLOG2. The PTX ex2 and log2 functions are always approximate.
++ // No FPOW or FREM in PTX.
++
+ // Now deduce the information based on the above mentioned
+ // actions
+ computeRegisterProperties(STI.getRegisterInfo());
+Index: lib/Target/NVPTX/NVPTXInstrInfo.td
+===================================================================
+--- a/lib/Target/NVPTX/NVPTXInstrInfo.td
++++ b/lib/Target/NVPTX/NVPTXInstrInfo.td
+@@ -207,15 +207,63 @@
+ }
+
+ // Template for instructions which take three fp64 or fp32 args. The
+-// instructions are named ".f" (e.g. "add.f64").
++// instructions are named ".f" (e.g. "min.f64").
+ //
+ // Also defines ftz (flush subnormal inputs and results to sign-preserving
+ // zero) variants for fp32 functions.
++//
++// This multiclass should be used for nodes that cannot be folded into FMAs.
++// For nodes that can be folded into FMAs (i.e. adds and muls), use
++// F3_fma_component.
+ multiclass F3 {
+ def f64rr :
+ NVPTXInst<(outs Float64Regs:$dst),
+ (ins Float64Regs:$a, Float64Regs:$b),
+ !strconcat(OpcStr, ".f64 \t$dst, $a, $b;"),
++ [(set Float64Regs:$dst, (OpNode Float64Regs:$a, Float64Regs:$b))]>;
++ def f64ri :
++ NVPTXInst<(outs Float64Regs:$dst),
++ (ins Float64Regs:$a, f64imm:$b),
++ !strconcat(OpcStr, ".f64 \t$dst, $a, $b;"),
++ [(set Float64Regs:$dst, (OpNode Float64Regs:$a, fpimm:$b))]>;
++ def f32rr_ftz :
++ NVPTXInst<(outs Float32Regs:$dst),
++ (ins Float32Regs:$a, Float32Regs:$b),
++ !strconcat(OpcStr, ".ftz.f32 \t$dst, $a, $b;"),
++ [(set Float32Regs:$dst, (OpNode Float32Regs:$a, Float32Regs:$b))]>,
++ Requires<[doF32FTZ]>;
++ def f32ri_ftz :
++ NVPTXInst<(outs Float32Regs:$dst),
++ (ins Float32Regs:$a, f32imm:$b),
++ !strconcat(OpcStr, ".ftz.f32 \t$dst, $a, $b;"),
++ [(set Float32Regs:$dst, (OpNode Float32Regs:$a, fpimm:$b))]>,
++ Requires<[doF32FTZ]>;
++ def f32rr :
++ NVPTXInst<(outs Float32Regs:$dst),
++ (ins Float32Regs:$a, Float32Regs:$b),
++ !strconcat(OpcStr, ".f32 \t$dst, $a, $b;"),
++ [(set Float32Regs:$dst, (OpNode Float32Regs:$a, Float32Regs:$b))]>;
++ def f32ri :
++ NVPTXInst<(outs Float32Regs:$dst),
++ (ins Float32Regs:$a, f32imm:$b),
++ !strconcat(OpcStr, ".f32 \t$dst, $a, $b;"),
++ [(set Float32Regs:$dst, (OpNode Float32Regs:$a, fpimm:$b))]>;
++}
++
++// Template for instructions which take three fp64 or fp32 args. The
++// instructions are named ".f" (e.g. "add.f64").
++//
++// Also defines ftz (flush subnormal inputs and results to sign-preserving
++// zero) variants for fp32 functions.
++//
++// This multiclass should be used for nodes that can be folded to make fma ops.
++// In this case, we use the ".rn" variant when FMA is disabled, as this behaves ++// just like the non ".rn" op, but prevents ptxas from creating FMAs. ++multiclass F3_fma_component { ++ def f64rr : ++ NVPTXInst<(outs Float64Regs:$dst), ++ (ins Float64Regs:$a, Float64Regs:$b), ++ !strconcat(OpcStr, ".f64 \t$dst, $a, $b;"), + [(set Float64Regs:$dst, (OpNode Float64Regs:$a, Float64Regs:$b))]>, + Requires<[allowFMA]>; + def f64ri : +@@ -248,41 +296,39 @@ + !strconcat(OpcStr, ".f32 \t$dst, $a, $b;"), + [(set Float32Regs:$dst, (OpNode Float32Regs:$a, fpimm:$b))]>, + Requires<[allowFMA]>; +-} + +-// Same as F3, but defines ".rn" variants (round to nearest even). +-multiclass F3_rn { +- def f64rr : ++ // These have strange names so we don't perturb existing mir tests. ++ def _rnf64rr : + NVPTXInst<(outs Float64Regs:$dst), + (ins Float64Regs:$a, Float64Regs:$b), + !strconcat(OpcStr, ".rn.f64 \t$dst, $a, $b;"), + [(set Float64Regs:$dst, (OpNode Float64Regs:$a, Float64Regs:$b))]>, + Requires<[noFMA]>; +- def f64ri : ++ def _rnf64ri : + NVPTXInst<(outs Float64Regs:$dst), + (ins Float64Regs:$a, f64imm:$b), + !strconcat(OpcStr, ".rn.f64 \t$dst, $a, $b;"), + [(set Float64Regs:$dst, (OpNode Float64Regs:$a, fpimm:$b))]>, + Requires<[noFMA]>; +- def f32rr_ftz : ++ def _rnf32rr_ftz : + NVPTXInst<(outs Float32Regs:$dst), + (ins Float32Regs:$a, Float32Regs:$b), + !strconcat(OpcStr, ".rn.ftz.f32 \t$dst, $a, $b;"), + [(set Float32Regs:$dst, (OpNode Float32Regs:$a, Float32Regs:$b))]>, + Requires<[noFMA, doF32FTZ]>; +- def f32ri_ftz : ++ def _rnf32ri_ftz : + NVPTXInst<(outs Float32Regs:$dst), + (ins Float32Regs:$a, f32imm:$b), + !strconcat(OpcStr, ".rn.ftz.f32 \t$dst, $a, $b;"), + [(set Float32Regs:$dst, (OpNode Float32Regs:$a, fpimm:$b))]>, + Requires<[noFMA, doF32FTZ]>; +- def f32rr : ++ def _rnf32rr : + NVPTXInst<(outs Float32Regs:$dst), + (ins Float32Regs:$a, Float32Regs:$b), + !strconcat(OpcStr, ".rn.f32 \t$dst, $a, $b;"), + [(set Float32Regs:$dst, (OpNode Float32Regs:$a, Float32Regs:$b))]>, + Requires<[noFMA]>; +- def f32ri : ++ def _rnf32ri : + NVPTXInst<(outs Float32Regs:$dst), + (ins Float32Regs:$a, f32imm:$b), + !strconcat(OpcStr, ".rn.f32 \t$dst, $a, $b;"), +@@ -713,13 +759,12 @@ + N->getValueAPF().convertToDouble() == 1.0; + }]>; + +-defm FADD : F3<"add", fadd>; +-defm FSUB : F3<"sub", fsub>; +-defm FMUL : F3<"mul", fmul>; +- +-defm FADD_rn : F3_rn<"add", fadd>; +-defm FSUB_rn : F3_rn<"sub", fsub>; +-defm FMUL_rn : F3_rn<"mul", fmul>; ++defm FADD : F3_fma_component<"add", fadd>; ++defm FSUB : F3_fma_component<"sub", fsub>; ++defm FMUL : F3_fma_component<"mul", fmul>; ++ ++defm FMIN : F3<"min", fminnum>; ++defm FMAX : F3<"max", fmaxnum>; + + defm FABS : F2<"abs", fabs>; + defm FNEG : F2<"neg", fneg>; +@@ -2628,6 +2673,55 @@ + def retflag : SDNode<"NVPTXISD::RET_FLAG", SDTNone, + [SDNPHasChain, SDNPOptInGlue]>; + ++// fceil, ffloor, fround, ftrunc. 
++ ++def : Pat<(fceil Float32Regs:$a), ++ (CVT_f32_f32 Float32Regs:$a, CvtRPI_FTZ)>, Requires<[doF32FTZ]>; ++def : Pat<(fceil Float32Regs:$a), ++ (CVT_f32_f32 Float32Regs:$a, CvtRPI)>, Requires<[doNoF32FTZ]>; ++def : Pat<(fceil Float64Regs:$a), ++ (CVT_f64_f64 Float64Regs:$a, CvtRPI)>; ++ ++def : Pat<(ffloor Float32Regs:$a), ++ (CVT_f32_f32 Float32Regs:$a, CvtRMI_FTZ)>, Requires<[doF32FTZ]>; ++def : Pat<(ffloor Float32Regs:$a), ++ (CVT_f32_f32 Float32Regs:$a, CvtRMI)>, Requires<[doNoF32FTZ]>; ++def : Pat<(ffloor Float64Regs:$a), ++ (CVT_f64_f64 Float64Regs:$a, CvtRMI)>; ++ ++def : Pat<(fround Float32Regs:$a), ++ (CVT_f32_f32 Float32Regs:$a, CvtRNI_FTZ)>, Requires<[doF32FTZ]>; ++def : Pat<(f32 (fround Float32Regs:$a)), ++ (CVT_f32_f32 Float32Regs:$a, CvtRNI)>, Requires<[doNoF32FTZ]>; ++def : Pat<(f64 (fround Float64Regs:$a)), ++ (CVT_f64_f64 Float64Regs:$a, CvtRNI)>; ++ ++def : Pat<(ftrunc Float32Regs:$a), ++ (CVT_f32_f32 Float32Regs:$a, CvtRZI_FTZ)>, Requires<[doF32FTZ]>; ++def : Pat<(ftrunc Float32Regs:$a), ++ (CVT_f32_f32 Float32Regs:$a, CvtRZI)>, Requires<[doNoF32FTZ]>; ++def : Pat<(ftrunc Float64Regs:$a), ++ (CVT_f64_f64 Float64Regs:$a, CvtRZI)>; ++ ++// nearbyint and rint are implemented as rounding to nearest even. This isn't ++// strictly correct, because it causes us to ignore the rounding mode. But it ++// matches what CUDA's "libm" does. ++ ++def : Pat<(fnearbyint Float32Regs:$a), ++ (CVT_f32_f32 Float32Regs:$a, CvtRNI_FTZ)>, Requires<[doF32FTZ]>; ++def : Pat<(fnearbyint Float32Regs:$a), ++ (CVT_f32_f32 Float32Regs:$a, CvtRNI)>, Requires<[doNoF32FTZ]>; ++def : Pat<(fnearbyint Float64Regs:$a), ++ (CVT_f64_f64 Float64Regs:$a, CvtRNI)>; ++ ++def : Pat<(frint Float32Regs:$a), ++ (CVT_f32_f32 Float32Regs:$a, CvtRNI_FTZ)>, Requires<[doF32FTZ]>; ++def : Pat<(frint Float32Regs:$a), ++ (CVT_f32_f32 Float32Regs:$a, CvtRNI)>, Requires<[doNoF32FTZ]>; ++def : Pat<(frint Float64Regs:$a), ++ (CVT_f64_f64 Float64Regs:$a, CvtRNI)>; ++ ++ + //----------------------------------- + // Control-flow + //----------------------------------- +Index: test/CodeGen/NVPTX/bug22322.ll +=================================================================== +--- a/test/CodeGen/NVPTX/bug22322.ll ++++ b/test/CodeGen/NVPTX/bug22322.ll +@@ -22,7 +22,7 @@ + %8 = icmp eq i32 %7, 0 + %9 = select i1 %8, float 0.000000e+00, float -1.000000e+00 + store float %9, float* %ret_vec.sroa.8.i, align 4 +-; CHECK: setp.lt.f32 %p{{[0-9]+}}, %f{{[0-9]+}}, 0f00000000 ++; CHECK: max.f32 %f{{[0-9]+}}, %f{{[0-9]+}}, 0f00000000 + %10 = fcmp olt float %9, 0.000000e+00 + %ret_vec.sroa.8.i.val = load float, float* %ret_vec.sroa.8.i, align 4 + %11 = select i1 %10, float 0.000000e+00, float %ret_vec.sroa.8.i.val +Index: test/CodeGen/NVPTX/math-intrins.ll +=================================================================== +--- a/test/CodeGen/NVPTX/math-intrins.ll ++++ b/test/CodeGen/NVPTX/math-intrins.ll +@@ -0,0 +1,261 @@ ++; RUN: llc < %s | FileCheck %s ++target triple = "nvptx64-nvidia-cuda" ++ ++; Checks that llvm intrinsics for math functions are correctly lowered to PTX. 
++ ++declare float @llvm.ceil.f32(float) #0 ++declare double @llvm.ceil.f64(double) #0 ++declare float @llvm.floor.f32(float) #0 ++declare double @llvm.floor.f64(double) #0 ++declare float @llvm.round.f32(float) #0 ++declare double @llvm.round.f64(double) #0 ++declare float @llvm.nearbyint.f32(float) #0 ++declare double @llvm.nearbyint.f64(double) #0 ++declare float @llvm.rint.f32(float) #0 ++declare double @llvm.rint.f64(double) #0 ++declare float @llvm.trunc.f32(float) #0 ++declare double @llvm.trunc.f64(double) #0 ++declare float @llvm.fabs.f32(float) #0 ++declare double @llvm.fabs.f64(double) #0 ++declare float @llvm.minnum.f32(float, float) #0 ++declare double @llvm.minnum.f64(double, double) #0 ++declare float @llvm.maxnum.f32(float, float) #0 ++declare double @llvm.maxnum.f64(double, double) #0 ++ ++; ---- ceil ---- ++ ++; CHECK-LABEL: ceil_float ++define float @ceil_float(float %a) { ++ ; CHECK: cvt.rpi.f32.f32 ++ %b = call float @llvm.ceil.f32(float %a) ++ ret float %b ++} ++ ++; CHECK-LABEL: ceil_float_ftz ++define float @ceil_float_ftz(float %a) #1 { ++ ; CHECK: cvt.rpi.ftz.f32.f32 ++ %b = call float @llvm.ceil.f32(float %a) ++ ret float %b ++} ++ ++; CHECK-LABEL: ceil_double ++define double @ceil_double(double %a) { ++ ; CHECK: cvt.rpi.f64.f64 ++ %b = call double @llvm.ceil.f64(double %a) ++ ret double %b ++} ++ ++; ---- floor ---- ++ ++; CHECK-LABEL: floor_float ++define float @floor_float(float %a) { ++ ; CHECK: cvt.rmi.f32.f32 ++ %b = call float @llvm.floor.f32(float %a) ++ ret float %b ++} ++ ++; CHECK-LABEL: floor_float_ftz ++define float @floor_float_ftz(float %a) #1 { ++ ; CHECK: cvt.rmi.ftz.f32.f32 ++ %b = call float @llvm.floor.f32(float %a) ++ ret float %b ++} ++ ++; CHECK-LABEL: floor_double ++define double @floor_double(double %a) { ++ ; CHECK: cvt.rmi.f64.f64 ++ %b = call double @llvm.floor.f64(double %a) ++ ret double %b ++} ++ ++; ---- round ---- ++ ++; CHECK-LABEL: round_float ++define float @round_float(float %a) { ++ ; CHECK: cvt.rni.f32.f32 ++ %b = call float @llvm.round.f32(float %a) ++ ret float %b ++} ++ ++; CHECK-LABEL: round_float_ftz ++define float @round_float_ftz(float %a) #1 { ++ ; CHECK: cvt.rni.ftz.f32.f32 ++ %b = call float @llvm.round.f32(float %a) ++ ret float %b ++} ++ ++; CHECK-LABEL: round_double ++define double @round_double(double %a) { ++ ; CHECK: cvt.rni.f64.f64 ++ %b = call double @llvm.round.f64(double %a) ++ ret double %b ++} ++ ++; ---- nearbyint ---- ++ ++; CHECK-LABEL: nearbyint_float ++define float @nearbyint_float(float %a) { ++ ; CHECK: cvt.rni.f32.f32 ++ %b = call float @llvm.nearbyint.f32(float %a) ++ ret float %b ++} ++ ++; CHECK-LABEL: nearbyint_float_ftz ++define float @nearbyint_float_ftz(float %a) #1 { ++ ; CHECK: cvt.rni.ftz.f32.f32 ++ %b = call float @llvm.nearbyint.f32(float %a) ++ ret float %b ++} ++ ++; CHECK-LABEL: nearbyint_double ++define double @nearbyint_double(double %a) { ++ ; CHECK: cvt.rni.f64.f64 ++ %b = call double @llvm.nearbyint.f64(double %a) ++ ret double %b ++} ++ ++; ---- rint ---- ++ ++; CHECK-LABEL: rint_float ++define float @rint_float(float %a) { ++ ; CHECK: cvt.rni.f32.f32 ++ %b = call float @llvm.rint.f32(float %a) ++ ret float %b ++} ++ ++; CHECK-LABEL: rint_float_ftz ++define float @rint_float_ftz(float %a) #1 { ++ ; CHECK: cvt.rni.ftz.f32.f32 ++ %b = call float @llvm.rint.f32(float %a) ++ ret float %b ++} ++ ++; CHECK-LABEL: rint_double ++define double @rint_double(double %a) { ++ ; CHECK: cvt.rni.f64.f64 ++ %b = call double @llvm.rint.f64(double %a) ++ ret double %b ++} ++ ++; ---- trunc 
---- ++ ++; CHECK-LABEL: trunc_float ++define float @trunc_float(float %a) { ++ ; CHECK: cvt.rzi.f32.f32 ++ %b = call float @llvm.trunc.f32(float %a) ++ ret float %b ++} ++ ++; CHECK-LABEL: trunc_float_ftz ++define float @trunc_float_ftz(float %a) #1 { ++ ; CHECK: cvt.rzi.ftz.f32.f32 ++ %b = call float @llvm.trunc.f32(float %a) ++ ret float %b ++} ++ ++; CHECK-LABEL: trunc_double ++define double @trunc_double(double %a) { ++ ; CHECK: cvt.rzi.f64.f64 ++ %b = call double @llvm.trunc.f64(double %a) ++ ret double %b ++} ++ ++; ---- abs ---- ++ ++; CHECK-LABEL: abs_float ++define float @abs_float(float %a) { ++ ; CHECK: abs.f32 ++ %b = call float @llvm.fabs.f32(float %a) ++ ret float %b ++} ++ ++; CHECK-LABEL: abs_float_ftz ++define float @abs_float_ftz(float %a) #1 { ++ ; CHECK: abs.ftz.f32 ++ %b = call float @llvm.fabs.f32(float %a) ++ ret float %b ++} ++ ++; CHECK-LABEL: abs_double ++define double @abs_double(double %a) { ++ ; CHECK: abs.f64 ++ %b = call double @llvm.fabs.f64(double %a) ++ ret double %b ++} ++ ++; ---- min ---- ++ ++; CHECK-LABEL: min_float ++define float @min_float(float %a, float %b) { ++ ; CHECK: min.f32 ++ %x = call float @llvm.minnum.f32(float %a, float %b) ++ ret float %x ++} ++ ++; CHECK-LABEL: min_imm1 ++define float @min_imm1(float %a) { ++ ; CHECK: min.f32 ++ %x = call float @llvm.minnum.f32(float %a, float 0.0) ++ ret float %x ++} ++ ++; CHECK-LABEL: min_imm2 ++define float @min_imm2(float %a) { ++ ; CHECK: min.f32 ++ %x = call float @llvm.minnum.f32(float 0.0, float %a) ++ ret float %x ++} ++ ++; CHECK-LABEL: min_float_ftz ++define float @min_float_ftz(float %a, float %b) #1 { ++ ; CHECK: min.ftz.f32 ++ %x = call float @llvm.minnum.f32(float %a, float %b) ++ ret float %x ++} ++ ++; CHECK-LABEL: min_double ++define double @min_double(double %a, double %b) { ++ ; CHECK: min.f64 ++ %x = call double @llvm.minnum.f64(double %a, double %b) ++ ret double %x ++} ++ ++; ---- max ---- ++ ++; CHECK-LABEL: max_imm1 ++define float @max_imm1(float %a) { ++ ; CHECK: max.f32 ++ %x = call float @llvm.maxnum.f32(float %a, float 0.0) ++ ret float %x ++} ++ ++; CHECK-LABEL: max_imm2 ++define float @max_imm2(float %a) { ++ ; CHECK: max.f32 ++ %x = call float @llvm.maxnum.f32(float 0.0, float %a) ++ ret float %x ++} ++ ++; CHECK-LABEL: max_float ++define float @max_float(float %a, float %b) { ++ ; CHECK: max.f32 ++ %x = call float @llvm.maxnum.f32(float %a, float %b) ++ ret float %x ++} ++ ++; CHECK-LABEL: max_float_ftz ++define float @max_float_ftz(float %a, float %b) #1 { ++ ; CHECK: max.ftz.f32 ++ %x = call float @llvm.maxnum.f32(float %a, float %b) ++ ret float %x ++} ++ ++; CHECK-LABEL: max_double ++define double @max_double(double %a, double %b) { ++ ; CHECK: max.f64 ++ %x = call double @llvm.maxnum.f64(double %a, double %b) ++ ret double %x ++} ++ ++attributes #0 = { nounwind readnone } ++attributes #1 = { "nvptx-f32ftz" = "true" } diff --git a/llvm-D25865-cmakeshlib.patch b/llvm-D25865-cmakeshlib.patch new file mode 100644 index 0000000..1f98266 --- /dev/null +++ b/llvm-D25865-cmakeshlib.patch @@ -0,0 +1,83 @@ +From 417001588d232151050db2d32df443e2d073ebbf Mon Sep 17 00:00:00 2001 +From: Valentin Churavy +Date: Fri, 21 Oct 2016 17:25:04 +0900 +Subject: [PATCH] Fix llvm-shlib cmake build + +Summary: +This fixes a few things that used to work with a Makefile build, but were broken in cmake. + +1. Treat MINGW like a Linux system. +2. The shlib should never contain other shared libraries. 
+
+Subscribers: beanz, mgorny
+
+Differential Revision: https://reviews.llvm.org/D25865
+---
+ tools/llvm-shlib/CMakeLists.txt | 42 ++++++++++++++++++++---------------------
+ 1 file changed, 20 insertions(+), 22 deletions(-)
+
+diff --git a/tools/llvm-shlib/CMakeLists.txt b/tools/llvm-shlib/CMakeLists.txt
+index 3fe672d..edadb82 100644
+--- a/tools/llvm-shlib/CMakeLists.txt
++++ b/tools/llvm-shlib/CMakeLists.txt
+@@ -8,29 +8,27 @@ set(SOURCES
+ 
+ llvm_map_components_to_libnames(LIB_NAMES ${LLVM_DYLIB_COMPONENTS})
+ 
+-if(LLVM_LINK_LLVM_DYLIB)
+-  if(LLVM_DYLIB_EXPORTED_SYMBOL_FILE)
+-    message(WARNING "Using LLVM_LINK_LLVM_DYLIB with LLVM_DYLIB_EXPORTED_SYMBOL_FILE may not work. Use at your own risk.")
+-  endif()
+-
+-  # libLLVM.so should not have any dependencies on any other LLVM
+-  # shared libraries. When using the "all" pseudo-component,
+-  # LLVM_AVAILABLE_LIBS is added to the dependencies, which may
+-  # contain shared libraries (e.g. libLTO).
+-  #
+-  # Also exclude libLLVMTableGen for the following reasons:
+-  #  - it is only used by internal *-tblgen utilities;
+-  #  - it pollutes the global options space.
+-  foreach(lib ${LIB_NAMES})
+-    get_target_property(t ${lib} TYPE)
+-    if("${lib}" STREQUAL "LLVMTableGen")
+-    elseif("x${t}" STREQUAL "xSTATIC_LIBRARY")
+-      list(APPEND FILTERED_LIB_NAMES ${lib})
+-    endif()
+-  endforeach()
+-  set(LIB_NAMES ${FILTERED_LIB_NAMES})
++if(LLVM_LINK_LLVM_DYLIB AND LLVM_DYLIB_EXPORTED_SYMBOL_FILE)
++  message(WARNING "Using LLVM_LINK_LLVM_DYLIB with LLVM_DYLIB_EXPORTED_SYMBOL_FILE may not work. Use at your own risk.")
+ endif()
+ 
++# libLLVM.so should not have any dependencies on any other LLVM
++# shared libraries. When using the "all" pseudo-component,
++# LLVM_AVAILABLE_LIBS is added to the dependencies, which may
++# contain shared libraries (e.g. libLTO).
++#
++# Also exclude libLLVMTableGen for the following reasons:
++#  - it is only used by internal *-tblgen utilities;
++#  - it pollutes the global options space.
++foreach(lib ${LIB_NAMES})
++  get_target_property(t ${lib} TYPE)
++  if("${lib}" STREQUAL "LLVMTableGen")
++  elseif("x${t}" STREQUAL "xSTATIC_LIBRARY")
++    list(APPEND FILTERED_LIB_NAMES ${lib})
++  endif()
++endforeach()
++set(LIB_NAMES ${FILTERED_LIB_NAMES})
++
+ if(LLVM_DYLIB_EXPORTED_SYMBOL_FILE)
+   set(LLVM_EXPORTED_SYMBOL_FILE ${LLVM_DYLIB_EXPORTED_SYMBOL_FILE})
+   add_custom_target(libLLVMExports DEPENDS ${LLVM_EXPORTED_SYMBOL_FILE})
+@@ -39,7 +37,7 @@ endif()
+ add_llvm_library(LLVM SHARED DISABLE_LLVM_LINK_LLVM_DYLIB SONAME ${SOURCES})
+ 
+ list(REMOVE_DUPLICATES LIB_NAMES)
+-if("${CMAKE_SYSTEM_NAME}" STREQUAL "Linux") # FIXME: It should be "GNU ld for elf"
++if("${CMAKE_SYSTEM_NAME}" STREQUAL "Linux" OR MINGW) # FIXME: It should be "GNU ld for elf"
+   # GNU ld doesn't resolve symbols in the version script.
+   set(LIB_NAMES -Wl,--whole-archive ${LIB_NAMES} -Wl,--no-whole-archive)
+ elseif("${CMAKE_SYSTEM_NAME}" STREQUAL "Darwin")
+-- 
+2.10.1
+
diff --git a/llvm-D27389.patch b/llvm-D27389.patch
new file mode 100644
index 0000000..6ddc6e7
--- /dev/null
+++ b/llvm-D27389.patch
@@ -0,0 +1,66 @@
+commit 83dc06334ff95ad18a951d0bb540290510f2f81a
+Author: Keno Fischer
+Date:   Thu Dec 8 17:22:35 2016 +0000
+
+    ConstantFolding: Don't crash when encountering vector GEP
+
+    ConstantFolding tried to cast one of the scalar indices to a vector
+    type. Instead, use the vector type only for the first index (which
+    is the only one allowed to be a vector) and use its scalar type
+    otherwise.
+
+    Fixes PR31250.
+
+    Reviewers: majnemer
+    Differential Revision: https://reviews.llvm.org/D27389
+
+    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289073 91177308-0d34-0410-b5e6-96231b3b80d8
+
+diff --git a/lib/Analysis/ConstantFolding.cpp b/lib/Analysis/ConstantFolding.cpp
+index 2d1edfe..1c0bf01a 100644
+--- a/lib/Analysis/ConstantFolding.cpp
++++ b/lib/Analysis/ConstantFolding.cpp
+@@ -692,14 +692,15 @@ Constant *CastGEPIndices(Type *SrcElemTy, ArrayRef<Constant *> Ops,
+                                 Type *ResultTy, const DataLayout &DL,
+                                 const TargetLibraryInfo *TLI) {
+   Type *IntPtrTy = DL.getIntPtrType(ResultTy);
++  Type *IntPtrScalarTy = IntPtrTy->getScalarType();
+ 
+   bool Any = false;
+   SmallVector<Constant *, 32> NewIdxs;
+   for (unsigned i = 1, e = Ops.size(); i != e; ++i) {
+     if ((i == 1 ||
+-         !isa<StructType>(GetElementPtrInst::getIndexedType(SrcElemTy,
+-                                                            Ops.slice(1, i - 1)))) &&
+-        Ops[i]->getType() != IntPtrTy) {
++         !isa<StructType>(GetElementPtrInst::getIndexedType(
++             SrcElemTy, Ops.slice(1, i - 1)))) &&
++        Ops[i]->getType() != (i == 1 ? IntPtrTy : IntPtrScalarTy)) {
+       Any = true;
+       NewIdxs.push_back(ConstantExpr::getCast(CastInst::getCastOpcode(Ops[i],
+                                                                       true,
+diff --git a/test/Analysis/ConstantFolding/vectorgep-crash.ll b/test/Analysis/ConstantFolding/vectorgep-crash.ll
+new file mode 100644
+index 0000000..bcc96b2
+--- /dev/null
++++ b/test/Analysis/ConstantFolding/vectorgep-crash.ll
+@@ -0,0 +1,19 @@
++; RUN: opt -instcombine -S -o - %s | FileCheck %s
++; Tests that we don't crash upon encountering a vector GEP
++
++target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
++target triple = "x86_64-unknown-linux-gnu"
++
++%Dual = type { %Dual.72, %Partials.73 }
++%Dual.72 = type { double, %Partials }
++%Partials = type { [2 x double] }
++%Partials.73 = type { [2 x %Dual.72] }
++
++; Function Attrs: sspreq
++define <8 x i64*> @"julia_axpy!_65480"(%Dual* %arg1, <8 x i64> %arg2) {
++top:
++; CHECK: %VectorGep14 = getelementptr inbounds %Dual, %Dual* %arg1, <8 x i64> %arg2, i32 1, i32 0, i64 0, i32 1, i32 0, i64 0
++  %VectorGep14 = getelementptr inbounds %Dual, %Dual* %arg1, <8 x i64> %arg2, i32 1, i32 0, i64 0, i32 1, i32 0, i64 0
++  %0 = bitcast <8 x double*> %VectorGep14 to <8 x i64*>
++  ret <8 x i64*> %0
++}
diff --git a/llvm-D27397.patch b/llvm-D27397.patch
new file mode 100644
index 0000000..fd18b3f
--- /dev/null
+++ b/llvm-D27397.patch
@@ -0,0 +1,101 @@
+commit 99ca52276f9ee1386866d6dff6179cfa64824621
+Author: Keno Fischer
+Date:   Mon Dec 5 21:25:03 2016 +0000
+
+    [LAA] Prevent invalid IR for loop-invariant bound in loop body
+
+    Summary:
+    If LAA expands a bound that is loop invariant, but not hoisted out
+    of the loop body, it used to use that value anyway, causing a
+    non-domination error, because the memcheck block is of course not
+    dominated by the scalar loop body. Detect this situation and expand
+    the SCEV expression instead.
+
+    Fixes PR31251
+
+    Reviewers: anemet
+    Subscribers: mzolotukhin, llvm-commits
+
+    Differential Revision: https://reviews.llvm.org/D27397
+
+    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288705 91177308-0d34-0410-b5e6-96231b3b80d8
+
+diff --git a/lib/Analysis/LoopAccessAnalysis.cpp b/lib/Analysis/LoopAccessAnalysis.cpp
+index 01a2f46..2f3dca3 100644
+--- a/lib/Analysis/LoopAccessAnalysis.cpp
++++ b/lib/Analysis/LoopAccessAnalysis.cpp
+@@ -1870,18 +1870,24 @@ expandBounds(const RuntimePointerChecking::CheckingPtrGroup *CG, Loop *TheLoop,
+   Value *Ptr = PtrRtChecking.Pointers[CG->Members[0]].PointerValue;
+   const SCEV *Sc = SE->getSCEV(Ptr);
+ 
++  unsigned AS = Ptr->getType()->getPointerAddressSpace();
++  LLVMContext &Ctx = Loc->getContext();
++
++  // Use this type for pointer arithmetic.
++  Type *PtrArithTy = Type::getInt8PtrTy(Ctx, AS);
++
+   if (SE->isLoopInvariant(Sc, TheLoop)) {
+     DEBUG(dbgs() << "LAA: Adding RT check for a loop invariant ptr:" << *Ptr
+                  << "\n");
+-    return {Ptr, Ptr};
++    // Ptr could be in the loop body. If so, expand a new one at the correct
++    // location.
++    Instruction *Inst = dyn_cast<Instruction>(Ptr);
++    Value *NewPtr = (Inst && TheLoop->contains(Inst))
++                        ? Exp.expandCodeFor(Sc, PtrArithTy, Loc)
++                        : Ptr;
++    return {NewPtr, NewPtr};
+   } else {
+-    unsigned AS = Ptr->getType()->getPointerAddressSpace();
+-    LLVMContext &Ctx = Loc->getContext();
+-
+-    // Use this type for pointer arithmetic.
+-    Type *PtrArithTy = Type::getInt8PtrTy(Ctx, AS);
+     Value *Start = nullptr, *End = nullptr;
+-
+     DEBUG(dbgs() << "LAA: Adding RT check for range:\n");
+     Start = Exp.expandCodeFor(CG->Low, PtrArithTy, Loc);
+     End = Exp.expandCodeFor(CG->High, PtrArithTy, Loc);
+diff --git a/test/Transforms/LoopVersioning/loop-invariant-bound.ll b/test/Transforms/LoopVersioning/loop-invariant-bound.ll
+new file mode 100644
+index 0000000..3411adb
+--- /dev/null
++++ b/test/Transforms/LoopVersioning/loop-invariant-bound.ll
+@@ -0,0 +1,37 @@
++; RUN: opt -loop-versioning -S < %s | FileCheck %s
++; Checks that when introducing check, we don't accidentally introduce non-dominating instructions
++target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
++
++%Dual.212 = type { %Dual.213, %Partials.215 }
++%Dual.213 = type { double, %Partials.214 }
++%Partials.214 = type { [2 x double] }
++%Partials.215 = type { [2 x %Dual.213] }
++
++; Function Attrs: sspreq
++define void @"julia_axpy!_65480"(%Dual.212*) {
++top:
++  br label %if24
++
++; CHECK-NOT: %bc = bitcast i64* %v2.sroa.0.0..sroa_cast
++; CHECK: %bound0
++
++if24:                                             ; preds = %if24, %top
++  %"#temp#1.sroa.3.02" = phi i64 [ undef, %top ], [ %2, %if24 ]
++  %"#temp#1.sroa.0.01" = phi i64 [ undef, %top ], [ %1, %if24 ]
++  %1 = add i64 %"#temp#1.sroa.0.01", 1
++  %2 = add i64 %"#temp#1.sroa.3.02", 1
++  ; This pointer is loop invariant. LAA used to re-use it from memcheck, even though it didn't dominate.
++  %v2.sroa.0.0..sroa_cast = bitcast %Dual.212* %0 to i64*
++  %v2.sroa.0.0.copyload = load i64, i64* %v2.sroa.0.0..sroa_cast, align 1
++  %3 = add i64 %"#temp#1.sroa.0.01", -1
++  %4 = getelementptr inbounds %Dual.212, %Dual.212* undef, i64 %3, i32 1, i32 0, i64 0, i32 1, i32 0, i64 0
++  %5 = bitcast double* %4 to i64*
++  store i64 undef, i64* %5, align 8
++  %notlhs27 = icmp eq i64 %2, undef
++  %notrhs28 = icmp eq i64 %1, undef
++  %6 = or i1 %notrhs28, %notlhs27
++  br i1 %6, label %L41.L335_crit_edge, label %if24
++
++L41.L335_crit_edge:                               ; preds = %if24
++  ret void
++}
diff --git a/llvm-D27609-AArch64-UABS_G3.patch b/llvm-D27609-AArch64-UABS_G3.patch
new file mode 100644
index 0000000..ba4a1b7
--- /dev/null
+++ b/llvm-D27609-AArch64-UABS_G3.patch
@@ -0,0 +1,311 @@
+From df0ce05530fd3a0e4c283af817f4446d439647ea Mon Sep 17 00:00:00 2001
+From: yuyichao
+Date: Thu, 15 Dec 2016 22:36:53 +0000
+Subject: [PATCH 1/2] Fix R_AARCH64_MOVW_UABS_G3 relocation
+
+Summary: The relocation is missing a mask so an address that has non-zero bits in 47:43 may overwrite the register number. (Frequently shows up as target register changed to `xzr`....)
+
+Reviewers: t.p.northover, lhames
+
+Subscribers: davide, aemerson, rengolin, llvm-commits
+
+Differential Revision: https://reviews.llvm.org/D27609
+
+git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289880 91177308-0d34-0410-b5e6-96231b3b80d8
+---
+ include/llvm/Object/ELFObjectFile.h                |  2 +-
+ include/llvm/Object/RelocVisitor.h                 |  1 +
+ lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp | 72 +++++++++++++++-------
+ .../RuntimeDyld/AArch64/ELF_ARM64_BE-relocations.s | 34 ++++++++++
+ .../RuntimeDyld/AArch64/ELF_ARM64_relocations.s    | 33 ++++++++++
+ 5 files changed, 118 insertions(+), 24 deletions(-)
+ create mode 100644 test/ExecutionEngine/RuntimeDyld/AArch64/ELF_ARM64_BE-relocations.s
+ create mode 100644 test/ExecutionEngine/RuntimeDyld/AArch64/ELF_ARM64_relocations.s
+
+diff --git a/include/llvm/Object/ELFObjectFile.h b/include/llvm/Object/ELFObjectFile.h
+index 4209da8..69987d4 100644
+--- a/include/llvm/Object/ELFObjectFile.h
++++ b/include/llvm/Object/ELFObjectFile.h
+@@ -972,7 +972,7 @@ unsigned ELFObjectFile<ELFT>::getArch() const {
+   case ELF::EM_X86_64:
+     return Triple::x86_64;
+   case ELF::EM_AARCH64:
+-    return Triple::aarch64;
++    return IsLittleEndian ? Triple::aarch64 : Triple::aarch64_be;
+   case ELF::EM_ARM:
+     return Triple::arm;
+   case ELF::EM_AVR:
+diff --git a/include/llvm/Object/RelocVisitor.h b/include/llvm/Object/RelocVisitor.h
+index e1926aa..3510d29 100644
+--- a/include/llvm/Object/RelocVisitor.h
++++ b/include/llvm/Object/RelocVisitor.h
+@@ -86,6 +86,7 @@ private:
+       return RelocToApply();
+     }
+     case Triple::aarch64:
++    case Triple::aarch64_be:
+       switch (RelocType) {
+       case llvm::ELF::R_AARCH64_ABS32:
+         return visitELF_AARCH64_ABS32(R, Value);
+diff --git a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
+index c70e81a..a977dce 100644
+--- a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
++++ b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
+@@ -325,6 +325,8 @@ void RuntimeDyldELF::resolveAArch64Relocation(const SectionEntry &Section,
+   uint32_t *TargetPtr =
+       reinterpret_cast<uint32_t *>(Section.getAddressWithOffset(Offset));
+   uint64_t FinalAddress = Section.getLoadAddressWithOffset(Offset);
++  // Data should use target endian. Code should always use little endian.
++ bool isBE = Arch == Triple::aarch64_be; + + DEBUG(dbgs() << "resolveAArch64Relocation, LocalAddress: 0x" + << format("%llx", Section.getAddressWithOffset(Offset)) +@@ -340,14 +342,22 @@ void RuntimeDyldELF::resolveAArch64Relocation(const SectionEntry &Section, + case ELF::R_AARCH64_ABS64: { + uint64_t *TargetPtr = + reinterpret_cast(Section.getAddressWithOffset(Offset)); +- *TargetPtr = Value + Addend; ++ if (isBE) ++ support::ubig64_t::ref{TargetPtr} = Value + Addend; ++ else ++ support::ulittle64_t::ref{TargetPtr} = Value + Addend; + break; + } + case ELF::R_AARCH64_PREL32: { + uint64_t Result = Value + Addend - FinalAddress; + assert(static_cast(Result) >= INT32_MIN && + static_cast(Result) <= UINT32_MAX); +- *TargetPtr = static_cast(Result & 0xffffffffU); ++ if (isBE) ++ support::ubig32_t::ref{TargetPtr} = ++ static_cast(Result & 0xffffffffU); ++ else ++ support::ulittle32_t::ref{TargetPtr} = ++ static_cast(Result & 0xffffffffU); + break; + } + case ELF::R_AARCH64_CALL26: // fallthrough +@@ -355,104 +365,120 @@ void RuntimeDyldELF::resolveAArch64Relocation(const SectionEntry &Section, + // Operation: S+A-P. Set Call or B immediate value to bits fff_fffc of the + // calculation. + uint64_t BranchImm = Value + Addend - FinalAddress; ++ uint32_t TargetValue = support::ulittle32_t::ref{TargetPtr}; + + // "Check that -2^27 <= result < 2^27". + assert(isInt<28>(BranchImm)); + + // AArch64 code is emitted with .rela relocations. The data already in any + // bits affected by the relocation on entry is garbage. +- *TargetPtr &= 0xfc000000U; ++ TargetValue &= 0xfc000000U; + // Immediate goes in bits 25:0 of B and BL. +- *TargetPtr |= static_cast(BranchImm & 0xffffffcU) >> 2; ++ TargetValue |= static_cast(BranchImm & 0xffffffcU) >> 2; ++ support::ulittle32_t::ref{TargetPtr} = TargetValue; + break; + } + case ELF::R_AARCH64_MOVW_UABS_G3: { + uint64_t Result = Value + Addend; ++ uint32_t TargetValue = support::ulittle32_t::ref{TargetPtr}; + + // AArch64 code is emitted with .rela relocations. The data already in any + // bits affected by the relocation on entry is garbage. +- *TargetPtr &= 0xffe0001fU; ++ TargetValue &= 0xffe0001fU; + // Immediate goes in bits 20:5 of MOVZ/MOVK instruction +- *TargetPtr |= Result >> (48 - 5); ++ TargetValue |= ((Result & 0xffff000000000000ULL) >> (48 - 5)); + // Shift must be "lsl #48", in bits 22:21 +- assert((*TargetPtr >> 21 & 0x3) == 3 && "invalid shift for relocation"); ++ assert((TargetValue >> 21 & 0x3) == 3 && "invalid shift for relocation"); ++ support::ulittle32_t::ref{TargetPtr} = TargetValue; + break; + } + case ELF::R_AARCH64_MOVW_UABS_G2_NC: { + uint64_t Result = Value + Addend; ++ uint32_t TargetValue = support::ulittle32_t::ref{TargetPtr}; + + // AArch64 code is emitted with .rela relocations. The data already in any + // bits affected by the relocation on entry is garbage. 
+- *TargetPtr &= 0xffe0001fU; ++ TargetValue &= 0xffe0001fU; + // Immediate goes in bits 20:5 of MOVZ/MOVK instruction +- *TargetPtr |= ((Result & 0xffff00000000ULL) >> (32 - 5)); ++ TargetValue |= ((Result & 0xffff00000000ULL) >> (32 - 5)); + // Shift must be "lsl #32", in bits 22:21 +- assert((*TargetPtr >> 21 & 0x3) == 2 && "invalid shift for relocation"); ++ assert((TargetValue >> 21 & 0x3) == 2 && "invalid shift for relocation"); ++ support::ulittle32_t::ref{TargetPtr} = TargetValue; + break; + } + case ELF::R_AARCH64_MOVW_UABS_G1_NC: { + uint64_t Result = Value + Addend; ++ uint32_t TargetValue = support::ulittle32_t::ref{TargetPtr}; + + // AArch64 code is emitted with .rela relocations. The data already in any + // bits affected by the relocation on entry is garbage. +- *TargetPtr &= 0xffe0001fU; ++ TargetValue &= 0xffe0001fU; + // Immediate goes in bits 20:5 of MOVZ/MOVK instruction +- *TargetPtr |= ((Result & 0xffff0000U) >> (16 - 5)); ++ TargetValue |= ((Result & 0xffff0000U) >> (16 - 5)); + // Shift must be "lsl #16", in bits 22:2 +- assert((*TargetPtr >> 21 & 0x3) == 1 && "invalid shift for relocation"); ++ assert((TargetValue >> 21 & 0x3) == 1 && "invalid shift for relocation"); ++ support::ulittle32_t::ref{TargetPtr} = TargetValue; + break; + } + case ELF::R_AARCH64_MOVW_UABS_G0_NC: { + uint64_t Result = Value + Addend; ++ uint32_t TargetValue = support::ulittle32_t::ref{TargetPtr}; + + // AArch64 code is emitted with .rela relocations. The data already in any + // bits affected by the relocation on entry is garbage. +- *TargetPtr &= 0xffe0001fU; ++ TargetValue &= 0xffe0001fU; + // Immediate goes in bits 20:5 of MOVZ/MOVK instruction +- *TargetPtr |= ((Result & 0xffffU) << 5); ++ TargetValue |= ((Result & 0xffffU) << 5); + // Shift must be "lsl #0", in bits 22:21. +- assert((*TargetPtr >> 21 & 0x3) == 0 && "invalid shift for relocation"); ++ assert((TargetValue >> 21 & 0x3) == 0 && "invalid shift for relocation"); ++ support::ulittle32_t::ref{TargetPtr} = TargetValue; + break; + } + case ELF::R_AARCH64_ADR_PREL_PG_HI21: { + // Operation: Page(S+A) - Page(P) + uint64_t Result = + ((Value + Addend) & ~0xfffULL) - (FinalAddress & ~0xfffULL); ++ uint32_t TargetValue = support::ulittle32_t::ref{TargetPtr}; + + // Check that -2^32 <= X < 2^32 + assert(isInt<33>(Result) && "overflow check failed for relocation"); + + // AArch64 code is emitted with .rela relocations. The data already in any + // bits affected by the relocation on entry is garbage. +- *TargetPtr &= 0x9f00001fU; ++ TargetValue &= 0x9f00001fU; + // Immediate goes in bits 30:29 + 5:23 of ADRP instruction, taken + // from bits 32:12 of X. +- *TargetPtr |= ((Result & 0x3000U) << (29 - 12)); +- *TargetPtr |= ((Result & 0x1ffffc000ULL) >> (14 - 5)); ++ TargetValue |= ((Result & 0x3000U) << (29 - 12)); ++ TargetValue |= ((Result & 0x1ffffc000ULL) >> (14 - 5)); ++ support::ulittle32_t::ref{TargetPtr} = TargetValue; + break; + } + case ELF::R_AARCH64_LDST32_ABS_LO12_NC: { + // Operation: S + A + uint64_t Result = Value + Addend; ++ uint32_t TargetValue = support::ulittle32_t::ref{TargetPtr}; + + // AArch64 code is emitted with .rela relocations. The data already in any + // bits affected by the relocation on entry is garbage. 
+- *TargetPtr &= 0xffc003ffU; ++ TargetValue &= 0xffc003ffU; + // Immediate goes in bits 21:10 of LD/ST instruction, taken + // from bits 11:2 of X +- *TargetPtr |= ((Result & 0xffc) << (10 - 2)); ++ TargetValue |= ((Result & 0xffc) << (10 - 2)); ++ support::ulittle32_t::ref{TargetPtr} = TargetValue; + break; + } + case ELF::R_AARCH64_LDST64_ABS_LO12_NC: { + // Operation: S + A + uint64_t Result = Value + Addend; ++ uint32_t TargetValue = support::ulittle32_t::ref{TargetPtr}; + + // AArch64 code is emitted with .rela relocations. The data already in any + // bits affected by the relocation on entry is garbage. +- *TargetPtr &= 0xffc003ffU; ++ TargetValue &= 0xffc003ffU; + // Immediate goes in bits 21:10 of LD/ST instruction, taken + // from bits 11:3 of X +- *TargetPtr |= ((Result & 0xff8) << (10 - 3)); ++ TargetValue |= ((Result & 0xff8) << (10 - 3)); ++ support::ulittle32_t::ref{TargetPtr} = TargetValue; + break; + } + } +diff --git a/test/ExecutionEngine/RuntimeDyld/AArch64/ELF_ARM64_BE-relocations.s b/test/ExecutionEngine/RuntimeDyld/AArch64/ELF_ARM64_BE-relocations.s +new file mode 100644 +index 0000000..3ba95e4 +--- /dev/null ++++ b/test/ExecutionEngine/RuntimeDyld/AArch64/ELF_ARM64_BE-relocations.s +@@ -0,0 +1,34 @@ ++# RUN: llvm-mc -triple=aarch64_be-none-linux-gnu -filetype=obj -o %T/be-reloc.o %s ++# RUN: llvm-rtdyld -triple=aarch64_be-none-linux-gnu -verify -dummy-extern f=0x0123456789abcdef -check=%s %T/be-reloc.o ++ ++ .text ++ .globl g ++ .p2align 2 ++ .type g,@function ++g: ++# R_AARCH64_MOVW_UABS_G3 ++ movz x0, #:abs_g3:f ++# R_AARCH64_MOVW_UABS_G2_NC ++ movk x0, #:abs_g2_nc:f ++# R_AARCH64_MOVW_UABS_G1_NC ++ movk x0, #:abs_g1_nc:f ++# R_AARCH64_MOVW_UABS_G0_NC ++ movk x0, #:abs_g0_nc:f ++ ret ++ .Lfunc_end0: ++ .size g, .Lfunc_end0-g ++ ++ .type k,@object ++ .data ++ .globl k ++ .p2align 3 ++k: ++ .xword f ++ .size k, 8 ++ ++# LE instructions read as BE ++# rtdyld-check: *{4}(g) = 0x6024e0d2 ++# rtdyld-check: *{4}(g + 4) = 0xe0acc8f2 ++# rtdyld-check: *{4}(g + 8) = 0x6035b1f2 ++# rtdyld-check: *{4}(g + 12) = 0xe0bd99f2 ++# rtdyld-check: *{8}k = f +diff --git a/test/ExecutionEngine/RuntimeDyld/AArch64/ELF_ARM64_relocations.s b/test/ExecutionEngine/RuntimeDyld/AArch64/ELF_ARM64_relocations.s +new file mode 100644 +index 0000000..f83f6bf +--- /dev/null ++++ b/test/ExecutionEngine/RuntimeDyld/AArch64/ELF_ARM64_relocations.s +@@ -0,0 +1,33 @@ ++# RUN: llvm-mc -triple=arm64-none-linux-gnu -filetype=obj -o %T/reloc.o %s ++# RUN: llvm-rtdyld -triple=arm64-none-linux-gnu -verify -dummy-extern f=0x0123456789abcdef -check=%s %T/reloc.o ++ ++ .text ++ .globl g ++ .p2align 2 ++ .type g,@function ++g: ++# R_AARCH64_MOVW_UABS_G3 ++ movz x0, #:abs_g3:f ++# R_AARCH64_MOVW_UABS_G2_NC ++ movk x0, #:abs_g2_nc:f ++# R_AARCH64_MOVW_UABS_G1_NC ++ movk x0, #:abs_g1_nc:f ++# R_AARCH64_MOVW_UABS_G0_NC ++ movk x0, #:abs_g0_nc:f ++ ret ++ .Lfunc_end0: ++ .size g, .Lfunc_end0-g ++ ++ .type k,@object ++ .data ++ .globl k ++ .p2align 3 ++k: ++ .xword f ++ .size k, 8 ++ ++# rtdyld-check: *{4}(g) = 0xd2e02460 ++# rtdyld-check: *{4}(g + 4) = 0xf2c8ace0 ++# rtdyld-check: *{4}(g + 8) = 0xf2b13560 ++# rtdyld-check: *{4}(g + 12) = 0xf299bde0 ++# rtdyld-check: *{8}k = f +-- +2.10.2 + diff --git a/llvm-D27629-AArch64-large_model.patch b/llvm-D27629-AArch64-large_model.patch new file mode 100644 index 0000000..ed28853 --- /dev/null +++ b/llvm-D27629-AArch64-large_model.patch @@ -0,0 +1,100 @@ +From 073a3b4c0e422396016ddea15181411e45c96af5 Mon Sep 17 00:00:00 2001 +From: Yichao Yu +Date: Fri, 9 Dec 2016 15:59:46 
-0500 +Subject: [PATCH 2/2] Fix unwind info relocation with large code model on + AArch64 + +--- + lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp | 10 ++++++++++ + lib/MC/MCObjectFileInfo.cpp | 2 ++ + .../AArch64/ELF_ARM64_BE-large-relocations.s | 18 ++++++++++++++++++ + .../RuntimeDyld/AArch64/ELF_ARM64_large-relocations.s | 18 ++++++++++++++++++ + 4 files changed, 48 insertions(+) + create mode 100644 test/ExecutionEngine/RuntimeDyld/AArch64/ELF_ARM64_BE-large-relocations.s + create mode 100644 test/ExecutionEngine/RuntimeDyld/AArch64/ELF_ARM64_large-relocations.s + +diff --git a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp +index a977dce..2a832f8 100644 +--- a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp ++++ b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp +@@ -360,6 +360,16 @@ void RuntimeDyldELF::resolveAArch64Relocation(const SectionEntry &Section, + static_cast(Result & 0xffffffffU); + break; + } ++ case ELF::R_AARCH64_PREL64: { ++ uint64_t *TargetPtr = ++ reinterpret_cast(Section.getAddressWithOffset(Offset)); ++ uint64_t Result = Value + Addend - FinalAddress; ++ if (isBE) ++ support::ubig64_t::ref{TargetPtr} = Result; ++ else ++ support::ulittle64_t::ref{TargetPtr} = Result; ++ break; ++ } + case ELF::R_AARCH64_CALL26: // fallthrough + case ELF::R_AARCH64_JUMP26: { + // Operation: S+A-P. Set Call or B immediate value to bits fff_fffc of the +diff --git a/lib/MC/MCObjectFileInfo.cpp b/lib/MC/MCObjectFileInfo.cpp +index 8fd71f6..bcf774e 100644 +--- a/lib/MC/MCObjectFileInfo.cpp ++++ b/lib/MC/MCObjectFileInfo.cpp +@@ -279,6 +279,8 @@ void MCObjectFileInfo::initELFMCObjectFileInfo(const Triple &T) { + case Triple::mips64el: + FDECFIEncoding = dwarf::DW_EH_PE_sdata8; + break; ++ case Triple::aarch64: ++ case Triple::aarch64_be: + case Triple::x86_64: + FDECFIEncoding = dwarf::DW_EH_PE_pcrel | + ((CMModel == CodeModel::Large) ? dwarf::DW_EH_PE_sdata8 +diff --git a/test/ExecutionEngine/RuntimeDyld/AArch64/ELF_ARM64_BE-large-relocations.s b/test/ExecutionEngine/RuntimeDyld/AArch64/ELF_ARM64_BE-large-relocations.s +new file mode 100644 +index 0000000..e3eeb02 +--- /dev/null ++++ b/test/ExecutionEngine/RuntimeDyld/AArch64/ELF_ARM64_BE-large-relocations.s +@@ -0,0 +1,18 @@ ++# RUN: llvm-mc -triple=aarch64_be-none-linux-gnu -code-model=large -filetype=obj -o %T/be-large-reloc.o %s ++# RUN: llvm-rtdyld -triple=aarch64_be-none-linux-gnu -verify -map-section be-large-reloc.o,.eh_frame=0x10000 -map-section be-large-reloc.o,.text=0xffff000000000000 -check=%s %T/be-large-reloc.o ++ ++ .text ++ .globl g ++ .p2align 2 ++ .type g,@function ++g: ++ .cfi_startproc ++ mov x0, xzr ++ ret ++ .Lfunc_end0: ++ .size g, .Lfunc_end0-g ++ .cfi_endproc ++ ++# Skip the CIE and load the 8 bytes PC begin pointer. ++# Assuming the CIE and the FDE length are both 4 bytes. 
++# rtdyld-check: *{8}(section_addr(be-large-reloc.o, .eh_frame) + (*{4}(section_addr(be-large-reloc.o, .eh_frame))) + 0xc) = g - (section_addr(be-large-reloc.o, .eh_frame) + (*{4}(section_addr(be-large-reloc.o, .eh_frame))) + 0xc) +diff --git a/test/ExecutionEngine/RuntimeDyld/AArch64/ELF_ARM64_large-relocations.s b/test/ExecutionEngine/RuntimeDyld/AArch64/ELF_ARM64_large-relocations.s +new file mode 100644 +index 0000000..ec30f19 +--- /dev/null ++++ b/test/ExecutionEngine/RuntimeDyld/AArch64/ELF_ARM64_large-relocations.s +@@ -0,0 +1,18 @@ ++# RUN: llvm-mc -triple=arm64-none-linux-gnu -code-model=large -filetype=obj -o %T/large-reloc.o %s ++# RUN: llvm-rtdyld -triple=arm64-none-linux-gnu -verify -map-section large-reloc.o,.eh_frame=0x10000 -map-section large-reloc.o,.text=0xffff000000000000 -check=%s %T/large-reloc.o ++ ++ .text ++ .globl g ++ .p2align 2 ++ .type g,@function ++g: ++ .cfi_startproc ++ mov x0, xzr ++ ret ++ .Lfunc_end0: ++ .size g, .Lfunc_end0-g ++ .cfi_endproc ++ ++# Skip the CIE and load the 8 bytes PC begin pointer. ++# Assuming the CIE and the FDE length are both 4 bytes. ++# rtdyld-check: *{8}(section_addr(large-reloc.o, .eh_frame) + (*{4}(section_addr(large-reloc.o, .eh_frame))) + 0xc) = g - (section_addr(large-reloc.o, .eh_frame) + (*{4}(section_addr(large-reloc.o, .eh_frame))) + 0xc) +-- +2.10.2 + diff --git a/llvm-D28009.patch b/llvm-D28009.patch new file mode 100644 index 0000000..ceba0b1 --- /dev/null +++ b/llvm-D28009.patch @@ -0,0 +1,68 @@ +commit 57ab82784ddb8d21eb0041d52f8490d8fd404e29 +Author: Michael Kuperstein +Date: Wed Dec 21 17:34:21 2016 +0000 + + [ConstantFolding] Fix vector GEPs harder + + For vector GEPs, CastGEPIndices can end up in an infinite recursion, because + we compare the vector type to the scalar pointer type, find them different, + and then try to cast a type to itself. + + Differential Revision: https://reviews.llvm.org/D28009 + + + git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290260 91177308-0d34-0410-b5e6-96231b3b80d8 + +diff --git a/lib/Analysis/ConstantFolding.cpp b/lib/Analysis/ConstantFolding.cpp +index cf0d5e4..9e521e1 100644 +--- a/lib/Analysis/ConstantFolding.cpp ++++ b/lib/Analysis/ConstantFolding.cpp +@@ -742,13 +742,16 @@ Constant *CastGEPIndices(Type *SrcElemTy, ArrayRef Ops, + if ((i == 1 || + !isa(GetElementPtrInst::getIndexedType( + SrcElemTy, Ops.slice(1, i - 1)))) && +- Ops[i]->getType() != (i == 1 ? IntPtrTy : IntPtrScalarTy)) { ++ Ops[i]->getType()->getScalarType() != IntPtrScalarTy) { + Any = true; ++ Type *NewType = Ops[i]->getType()->isVectorTy() ++ ? 
IntPtrTy ++ : IntPtrTy->getScalarType(); + NewIdxs.push_back(ConstantExpr::getCast(CastInst::getCastOpcode(Ops[i], + true, +- IntPtrTy, ++ NewType, + true), +- Ops[i], IntPtrTy)); ++ Ops[i], NewType)); + } else + NewIdxs.push_back(Ops[i]); + } +diff --git a/test/Analysis/ConstantFolding/vectorgep-crash.ll b/test/Analysis/ConstantFolding/vectorgep-crash.ll +index bcc96b2..e7a5117 100644 +--- a/test/Analysis/ConstantFolding/vectorgep-crash.ll ++++ b/test/Analysis/ConstantFolding/vectorgep-crash.ll +@@ -17,3 +17,24 @@ top: + %0 = bitcast <8 x double*> %VectorGep14 to <8 x i64*> + ret <8 x i64*> %0 + } ++ ++%struct.A = type { i32, %struct.B* } ++%struct.B = type { i64, %struct.C* } ++%struct.C = type { i64 } ++ ++@G = internal global [65 x %struct.A] zeroinitializer, align 16 ++; CHECK-LABEL: @test ++; CHECK: ret <16 x i32*> getelementptr ([65 x %struct.A], [65 x %struct.A]* @G, <16 x i64> zeroinitializer, <16 x i64> , <16 x i32> zeroinitializer) ++define <16 x i32*> @test() { ++vector.body: ++ %VectorGep = getelementptr [65 x %struct.A], [65 x %struct.A]* @G, <16 x i64> zeroinitializer, <16 x i64> , <16 x i32> zeroinitializer ++ ret <16 x i32*> %VectorGep ++} ++ ++; CHECK-LABEL: @test2 ++; CHECK: ret <16 x i32*> getelementptr ([65 x %struct.A], [65 x %struct.A]* @G, <16 x i64> zeroinitializer, <16 x i64> @test2() { ++vector.body: ++ %VectorGep = getelementptr [65 x %struct.A], [65 x %struct.A]* @G, <16 x i32> zeroinitializer, <16 x i64> , <16 x i32> zeroinitializer ++ ret <16 x i32*> %VectorGep ++} diff --git a/llvm-D9168_argument_alignment.patch b/llvm-D9168_argument_alignment.patch new file mode 100644 index 0000000..8166cc3 --- /dev/null +++ b/llvm-D9168_argument_alignment.patch @@ -0,0 +1,98 @@ +Index: lib/Target/NVPTX/NVPTXISelLowering.h +=================================================================== +--- a/lib/Target/NVPTX/NVPTXISelLowering.h ++++ b/lib/Target/NVPTX/NVPTXISelLowering.h +@@ -539,7 +539,8 @@ + SDValue PerformDAGCombine(SDNode *N, DAGCombinerInfo &DCI) const override; + + unsigned getArgumentAlignment(SDValue Callee, const ImmutableCallSite *CS, +- Type *Ty, unsigned Idx) const; ++ Type *Ty, unsigned Idx, ++ const DataLayout &DL) const; + }; + } // namespace llvm + +Index: lib/Target/NVPTX/NVPTXISelLowering.cpp +=================================================================== +--- a/lib/Target/NVPTX/NVPTXISelLowering.cpp ++++ b/lib/Target/NVPTX/NVPTXISelLowering.cpp +@@ -1024,11 +1024,15 @@ + return O.str(); + } + +-unsigned +-NVPTXTargetLowering::getArgumentAlignment(SDValue Callee, +- const ImmutableCallSite *CS, +- Type *Ty, +- unsigned Idx) const { ++unsigned NVPTXTargetLowering::getArgumentAlignment(SDValue Callee, ++ const ImmutableCallSite *CS, ++ Type *Ty, unsigned Idx, ++ const DataLayout &DL) const { ++ if (!CS) { ++ // CallSite is zero, fallback to ABI type alignment ++ return DL.getABITypeAlignment(Ty); ++ } ++ + unsigned Align = 0; + const Value *DirectCallee = CS->getCalledFunction(); + +@@ -1046,7 +1050,7 @@ + + const Value *CalleeV = cast(CalleeI)->getCalledValue(); + // Ignore any bitcast instructions +- while(isa(CalleeV)) { ++ while (isa(CalleeV)) { + const ConstantExpr *CE = cast(CalleeV); + if (!CE->isCast()) + break; +@@ -1069,7 +1073,6 @@ + + // Call is indirect or alignment information is not available, fall back to + // the ABI type alignment +- auto &DL = CS->getCaller()->getParent()->getDataLayout(); + return DL.getABITypeAlignment(Ty); + } + +@@ -1126,7 +1129,8 @@ + ComputePTXValueVTs(*this, DAG.getDataLayout(), Ty, vtparts, &Offsets, + 
0);
+
+-      unsigned align = getArgumentAlignment(Callee, CS, Ty, paramCount + 1);
++      unsigned align =
++          getArgumentAlignment(Callee, CS, Ty, paramCount + 1, DL);
+       // declare .param .align <align> .b8 .param<n>[<size>];
+       unsigned sz = DL.getTypeAllocSize(Ty);
+       SDVTList DeclareParamVTs = DAG.getVTList(MVT::Other, MVT::Glue);
+@@ -1166,7 +1170,8 @@
+     }
+     if (Ty->isVectorTy()) {
+       EVT ObjectVT = getValueType(DL, Ty);
+-      unsigned align = getArgumentAlignment(Callee, CS, Ty, paramCount + 1);
++      unsigned align =
++          getArgumentAlignment(Callee, CS, Ty, paramCount + 1, DL);
+       // declare .param .align <align> .b8 .param<n>[<size>];
+       unsigned sz = DL.getTypeAllocSize(Ty);
+       SDVTList DeclareParamVTs = DAG.getVTList(MVT::Other, MVT::Glue);
+@@ -1426,7 +1431,7 @@
+                                   DeclareRetOps);
+       InFlag = Chain.getValue(1);
+     } else {
+-      retAlignment = getArgumentAlignment(Callee, CS, retTy, 0);
++      retAlignment = getArgumentAlignment(Callee, CS, retTy, 0, DL);
+       SDVTList DeclareRetVTs = DAG.getVTList(MVT::Other, MVT::Glue);
+       SDValue DeclareRetOps[] = { Chain,
+                                   DAG.getConstant(retAlignment, dl, MVT::i32),
+@@ -1633,9 +1638,10 @@
+   } else {
+     SmallVector<EVT, 16> VTs;
+     SmallVector<uint64_t, 16> Offsets;
+-    ComputePTXValueVTs(*this, DAG.getDataLayout(), retTy, VTs, &Offsets, 0);
++    auto &DL = DAG.getDataLayout();
++    ComputePTXValueVTs(*this, DL, retTy, VTs, &Offsets, 0);
+     assert(VTs.size() == Ins.size() && "Bad value decomposition");
+-    unsigned RetAlign = getArgumentAlignment(Callee, CS, retTy, 0);
++    unsigned RetAlign = getArgumentAlignment(Callee, CS, retTy, 0, DL);
+     for (unsigned i = 0, e = Ins.size(); i != e; ++i) {
+       unsigned sz = VTs[i].getSizeInBits();
+       unsigned AlignI = GreatestCommonDivisor64(RetAlign, Offsets[i]);
diff --git a/llvm-PR22923.patch b/llvm-PR22923.patch
new file mode 100644
index 0000000..c48533b
--- /dev/null
+++ b/llvm-PR22923.patch
@@ -0,0 +1,151 @@
+From e060ffb4b20e294ecb8429bd8a925f9f12b63b17 Mon Sep 17 00:00:00 2001
+From: Hal Finkel
+Date: Mon, 29 Aug 2016 22:25:36 +0000
+Subject: [PATCH] [PowerPC] Fix i8/i16 atomics for little-Endian targets
+ without partword atomics
+
+For little-Endian PowerPC, we generally target only P8 and later by default.
+However, generic (older) 64-bit configurations are still an option, and in that
+case, partword atomics are not available (e.g. stbcx.). To lower i8/i16 atomics
+without true i8/i16 atomic operations, we emulate using i32 atomics in
+combination with a bunch of shifting and masking, etc. The amount by which to
+shift in little-Endian mode is different from the amount in big-Endian mode (it
+is inverted -- meaning we can leave off the xor when computing the amount).
+
+Fixes PR22923.
+
+git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280022 91177308-0d34-0410-b5e6-96231b3b80d8
+---
+ lib/Target/PowerPC/PPCISelLowering.cpp | 18 ++++++++++++------
+ test/CodeGen/PowerPC/atomic-2.ll       | 15 ++++++++++++++-
+ 2 files changed, 26 insertions(+), 7 deletions(-)
+
+diff --git a/lib/Target/PowerPC/PPCISelLowering.cpp b/lib/Target/PowerPC/PPCISelLowering.cpp
+index e89b6ca..f895b06 100644
+--- a/lib/Target/PowerPC/PPCISelLowering.cpp
++++ b/lib/Target/PowerPC/PPCISelLowering.cpp
+@@ -8513,6 +8513,7 @@ PPCTargetLowering::EmitPartwordAtomicBinary(MachineInstr &MI,
+   // registers without caring whether they're 32 or 64, but here we're
+   // doing actual arithmetic on the addresses.
+   bool is64bit = Subtarget.isPPC64();
++  bool isLittleEndian = Subtarget.isLittleEndian();
+   unsigned ZeroReg = is64bit ? PPC::ZERO8 : PPC::ZERO;
+
+   const BasicBlock *LLVM_BB = BB->getBasicBlock();
+@@ -8542,7 +8543,8 @@ PPCTargetLowering::EmitPartwordAtomicBinary(MachineInstr &MI,
+                                           : &PPC::GPRCRegClass;
+   unsigned PtrReg = RegInfo.createVirtualRegister(RC);
+   unsigned Shift1Reg = RegInfo.createVirtualRegister(RC);
+-  unsigned ShiftReg = RegInfo.createVirtualRegister(RC);
++  unsigned ShiftReg =
++      isLittleEndian ? Shift1Reg : RegInfo.createVirtualRegister(RC);
+   unsigned Incr2Reg = RegInfo.createVirtualRegister(RC);
+   unsigned MaskReg = RegInfo.createVirtualRegister(RC);
+   unsigned Mask2Reg = RegInfo.createVirtualRegister(RC);
+@@ -8587,8 +8589,9 @@ PPCTargetLowering::EmitPartwordAtomicBinary(MachineInstr &MI,
+   }
+   BuildMI(BB, dl, TII->get(PPC::RLWINM), Shift1Reg).addReg(Ptr1Reg)
+       .addImm(3).addImm(27).addImm(is8bit ? 28 : 27);
+-  BuildMI(BB, dl, TII->get(is64bit ? PPC::XORI8 : PPC::XORI), ShiftReg)
+-      .addReg(Shift1Reg).addImm(is8bit ? 24 : 16);
++  if (!isLittleEndian)
++    BuildMI(BB, dl, TII->get(is64bit ? PPC::XORI8 : PPC::XORI), ShiftReg)
++        .addReg(Shift1Reg).addImm(is8bit ? 24 : 16);
+   if (is64bit)
+     BuildMI(BB, dl, TII->get(PPC::RLDICR), PtrReg)
+         .addReg(Ptr1Reg).addImm(0).addImm(61);
+@@ -9293,6 +9296,7 @@ PPCTargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI,
+   // since we're actually doing arithmetic on them.  Other registers
+   // can be 32-bit.
+   bool is64bit = Subtarget.isPPC64();
++  bool isLittleEndian = Subtarget.isLittleEndian();
+   bool is8bit = MI.getOpcode() == PPC::ATOMIC_CMP_SWAP_I8;
+
+   unsigned dest = MI.getOperand(0).getReg();
+@@ -9319,7 +9323,8 @@ PPCTargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI,
+                                           : &PPC::GPRCRegClass;
+   unsigned PtrReg = RegInfo.createVirtualRegister(RC);
+   unsigned Shift1Reg = RegInfo.createVirtualRegister(RC);
+-  unsigned ShiftReg = RegInfo.createVirtualRegister(RC);
++  unsigned ShiftReg =
++      isLittleEndian ? Shift1Reg : RegInfo.createVirtualRegister(RC);
+   unsigned NewVal2Reg = RegInfo.createVirtualRegister(RC);
+   unsigned NewVal3Reg = RegInfo.createVirtualRegister(RC);
+   unsigned OldVal2Reg = RegInfo.createVirtualRegister(RC);
+@@ -9374,8 +9379,9 @@ PPCTargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI,
+   }
+   BuildMI(BB, dl, TII->get(PPC::RLWINM), Shift1Reg).addReg(Ptr1Reg)
+       .addImm(3).addImm(27).addImm(is8bit ? 28 : 27);
+-  BuildMI(BB, dl, TII->get(is64bit ? PPC::XORI8 : PPC::XORI), ShiftReg)
+-      .addReg(Shift1Reg).addImm(is8bit ? 24 : 16);
++  if (!isLittleEndian)
++    BuildMI(BB, dl, TII->get(is64bit ? PPC::XORI8 : PPC::XORI), ShiftReg)
++        .addReg(Shift1Reg).addImm(is8bit ? 24 : 16);
+   if (is64bit)
+     BuildMI(BB, dl, TII->get(PPC::RLDICR), PtrReg)
+         .addReg(Ptr1Reg).addImm(0).addImm(61);
+diff --git a/test/CodeGen/PowerPC/atomic-2.ll b/test/CodeGen/PowerPC/atomic-2.ll
+index 1857d5d..bafabdb 100644
+--- a/test/CodeGen/PowerPC/atomic-2.ll
++++ b/test/CodeGen/PowerPC/atomic-2.ll
+@@ -1,4 +1,5 @@
+-; RUN: llc < %s -march=ppc64 | FileCheck %s
++; RUN: llc < %s -march=ppc64 | FileCheck %s -check-prefix=CHECK -check-prefix=CHECK-BE
++; RUN: llc < %s -march=ppc64le | FileCheck %s -check-prefix=CHECK -check-prefix=CHECK-LE
+ ; RUN: llc < %s -march=ppc64 -mcpu=pwr7 | FileCheck %s
+ ; RUN: llc < %s -march=ppc64 -mcpu=pwr8 | FileCheck %s -check-prefix=CHECK-P8U
+
+@@ -12,6 +13,8 @@ define i64 @exchange_and_add(i64* %mem, i64 %val) nounwind {
+
+ define i8 @exchange_and_add8(i8* %mem, i8 %val) nounwind {
+ ; CHECK-LABEL: exchange_and_add8:
++; CHECK-BE: xori
++; CHECK-LE-NOT: xori
+ ; CHECK-P8U: lbarx
+   %tmp = atomicrmw add i8* %mem, i8 %val monotonic
+ ; CHECK-P8U: stbcx.
+@@ -20,6 +23,8 @@ define i8 @exchange_and_add8(i8* %mem, i8 %val) nounwind {
+
+ define i16 @exchange_and_add16(i16* %mem, i16 %val) nounwind {
+ ; CHECK-LABEL: exchange_and_add16:
++; CHECK-BE: xori
++; CHECK-LE-NOT: xori
+ ; CHECK-P8U: lharx
+   %tmp = atomicrmw add i16* %mem, i16 %val monotonic
+ ; CHECK-P8U: sthcx.
+@@ -38,6 +43,8 @@ define i64 @exchange_and_cmp(i64* %mem) nounwind {
+
+ define i8 @exchange_and_cmp8(i8* %mem) nounwind {
+ ; CHECK-LABEL: exchange_and_cmp8:
++; CHECK-BE: xori
++; CHECK-LE-NOT: xori
+ ; CHECK-P8U: lbarx
+   %tmppair = cmpxchg i8* %mem, i8 0, i8 1 monotonic monotonic
+   %tmp = extractvalue { i8, i1 } %tmppair, 0
+@@ -48,6 +55,8 @@ define i8 @exchange_and_cmp8(i8* %mem) nounwind {
+
+ define i16 @exchange_and_cmp16(i16* %mem) nounwind {
+ ; CHECK-LABEL: exchange_and_cmp16:
++; CHECK-BE: xori
++; CHECK-LE-NOT: xori
+ ; CHECK-P8U: lharx
+   %tmppair = cmpxchg i16* %mem, i16 0, i16 1 monotonic monotonic
+   %tmp = extractvalue { i16, i1 } %tmppair, 0
+@@ -66,6 +75,8 @@ define i64 @exchange(i64* %mem, i64 %val) nounwind {
+
+ define i8 @exchange8(i8* %mem, i8 %val) nounwind {
+ ; CHECK-LABEL: exchange8:
++; CHECK-BE: xori
++; CHECK-LE-NOT: xori
+ ; CHECK-P8U: lbarx
+   %tmp = atomicrmw xchg i8* %mem, i8 1 monotonic
+ ; CHECK-P8U: stbcx.
+@@ -74,6 +85,8 @@ define i8 @exchange8(i8* %mem, i8 %val) nounwind {
+
+ define i16 @exchange16(i16* %mem, i16 %val) nounwind {
+ ; CHECK-LABEL: exchange16:
++; CHECK-BE: xori
++; CHECK-LE-NOT: xori
+ ; CHECK-P8U: lharx
+   %tmp = atomicrmw xchg i16* %mem, i16 1 monotonic
+ ; CHECK-P8U: sthcx.
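The little-Endian shortcut in llvm-PR22923.patch above can be sanity-checked with plain integer arithmetic: the RLWINM computes the element's bit offset inside its containing 32-bit word as (address & 3) * 8, and only big-Endian needs the follow-up XORI (24 for bytes, 16 for halfwords) to flip the lane order, which is why the patch reuses Shift1Reg as ShiftReg on little-Endian. A standalone C++ sketch of that computation (illustrative only; the function name and layout are ours, not LLVM's):

#include <cstdint>
#include <cstdio>

// Bit position (from the LSB) of an i8/i16 element inside the 32-bit word
// that contains it, mirroring the RLWINM + conditional XORI in the patch.
unsigned partwordShift(uintptr_t addr, bool is8bit, bool isLittleEndian) {
  // RLWINM equivalent: low address bits times 8 (halfwords drop the low bit).
  unsigned shift = (addr & (is8bit ? 3u : 2u)) * 8;
  if (!isLittleEndian)
    shift ^= is8bit ? 24u : 16u; // XORI lane flip, big-Endian only
  return shift;
}

int main() {
  for (uintptr_t a = 0; a < 4; ++a)
    std::printf("byte offset %lu: BE shift %2u, LE shift %2u\n",
                static_cast<unsigned long>(a),
                partwordShift(a, true, false), partwordShift(a, true, true));
  return 0;
}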
diff --git a/llvm-arm-fix-prel31.patch b/llvm-arm-fix-prel31.patch
new file mode 100644
index 0000000..a823ce6
--- /dev/null
+++ b/llvm-arm-fix-prel31.patch
@@ -0,0 +1,60 @@
+From 6cef9adffcf9af3c632e58e0d7d4d6e1d0525980 Mon Sep 17 00:00:00 2001
+From: Yichao Yu
+Date: Thu, 29 Sep 2016 22:41:57 -0400
+Subject: [PATCH] Fix PREL31 relocation on ARM
+
+This is a 31bits relative relocation instead of a 32bits absolute relocation.
+---
+ lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp |  4 ++++
+ .../RuntimeDyld/ARM/ELF_ARM_EXIDX_relocations.s    | 23 ++++++++++++++++++++++
+ 2 files changed, 27 insertions(+)
+ create mode 100644 test/ExecutionEngine/RuntimeDyld/ARM/ELF_ARM_EXIDX_relocations.s
+
+diff --git a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
+index 6929732..2e0d168 100644
+--- a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
++++ b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
+@@ -463,7 +463,11 @@ void RuntimeDyldELF::resolveARMRelocation(const SectionEntry &Section,
+
+   case ELF::R_ARM_NONE:
+     break;
++  // Write a 31bit signed offset
+   case ELF::R_ARM_PREL31:
++    *TargetPtr &= 0x80000000;
++    *TargetPtr |= (Value - FinalAddress) & ~0x80000000;
++    break;
+   case ELF::R_ARM_TARGET1:
+   case ELF::R_ARM_ABS32:
+     *TargetPtr = Value;
+diff --git a/test/ExecutionEngine/RuntimeDyld/ARM/ELF_ARM_EXIDX_relocations.s b/test/ExecutionEngine/RuntimeDyld/ARM/ELF_ARM_EXIDX_relocations.s
+new file mode 100644
+index 0000000..eb07b00
+--- /dev/null
++++ b/test/ExecutionEngine/RuntimeDyld/ARM/ELF_ARM_EXIDX_relocations.s
+@@ -0,0 +1,23 @@
++# RUN: llvm-mc -triple=arm-linux-gnueabihf -filetype=obj -o %T/reloc.o %s
++# RUN: llvm-rtdyld -triple=arm-linux-gnueabihf -verify -map-section reloc.o,.ARM.exidx=0x6000 -map-section reloc.o,.text=0x4000 -dummy-extern __aeabi_unwind_cpp_pr0=0x1234 -check=%s %T/reloc.o
++
++	.text
++	.syntax unified
++	.eabi_attribute	67, "2.09"	@ Tag_conformance
++	.cpu	cortex-a8
++	.fpu	neon
++	.file	"reloc.c"
++	.globl	g
++	.align	2
++	.type	g,%function
++g:
++	.fnstart
++	movw	r0, #1
++	bx	lr
++ .Lfunc_end0:
++	.size	g, .Lfunc_end0-g
++	.fnend
++
++# rtdyld-check: *{4}(section_addr(reloc.o, .ARM.exidx)) = (g - (section_addr(reloc.o, .ARM.exidx))) & 0x7fffffff
++# Compat unwind info: finish(0xb0), finish(0xb0), finish(0xb0)
++# rtdyld-check: *{4}(section_addr(reloc.o, .ARM.exidx) + 0x4) = 0x80b0b0b0
+--
+2.10.0
+
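The R_ARM_PREL31 handling in llvm-arm-fix-prel31.patch above keeps bit 31 of the target word (the .ARM.exidx compact-entry flag) and stores a 31-bit offset relative to the fixup address; that is exactly what the first rtdyld-check line's mask with 0x7fffffff verifies. A standalone C++ sketch of the same encode/decode arithmetic (illustrative only, not part of the patch):

#include <cstdint>

// Apply an R_ARM_PREL31 fixup as the patch does: preserve the top bit of
// the existing word, then store the low 31 bits of the place-relative delta.
void applyPrel31(uint32_t *TargetPtr, uint64_t Value, uint64_t FinalAddress) {
  uint32_t Delta = static_cast<uint32_t>(Value - FinalAddress);
  *TargetPtr &= 0x80000000u;          // keep bit 31 (exidx inline-entry flag)
  *TargetPtr |= Delta & 0x7fffffffu;  // 31-bit signed, place-relative offset
}

// Recover the signed offset: shift the flag bit out, then sign-extend from
// bit 30 with an arithmetic right shift.
int32_t decodePrel31(uint32_t Word) {
  return static_cast<int32_t>(Word << 1) >> 1;
}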
diff --git a/llvm-config.h b/llvm-config.h
new file mode 100644
index 0000000..2fa08c9
--- /dev/null
+++ b/llvm-config.h
@@ -0,0 +1,9 @@
+#include <bits/wordsize.h>
+
+#if __WORDSIZE == 32
+#include "llvm-config-32.h"
+#elif __WORDSIZE == 64
+#include "llvm-config-64.h"
+#else
+#error "Unknown word size"
+#endif
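llvm-config.h above is the usual multilib wrapper: __WORDSIZE is provided by glibc via bits/wordsize.h, so one installed header can dispatch to a 32- or 64-bit generated config; the spec's %install section below creates those by renaming the real header to llvm-config-%{__isa_bits}.h and installing this wrapper in its place. A minimal C++ sketch of the same dispatch pattern (illustrative only):

#include <bits/wordsize.h> // glibc; defines __WORDSIZE for the target ABI
#include <cstdio>

int main() {
#if __WORDSIZE == 64
  std::puts("a 64-bit build would include llvm-config-64.h");
#elif __WORDSIZE == 32
  std::puts("a 32-bit build would include llvm-config-32.h");
#else
#error "Unknown word size"
#endif
  return 0;
}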
diff --git a/llvm-soversion.patch b/llvm-soversion.patch
new file mode 100644
index 0000000..dde8d26
--- /dev/null
+++ b/llvm-soversion.patch
@@ -0,0 +1,23 @@
+Index: cmake/modules/AddLLVM.cmake
+===================================================================
+--- cmake/modules/AddLLVM.cmake	(revision 283188)
++++ cmake/modules/AddLLVM.cmake	(revision 283189)
+@@ -450,6 +450,18 @@
+       PREFIX ""
+       )
+   endif()
++
++  # Set SOVERSION on shared libraries that lack explicit SONAME
++  # specifier, on *nix systems that are not Darwin.
++  if(UNIX AND NOT APPLE AND NOT ARG_SONAME)
++    set_target_properties(${name}
++      PROPERTIES
++      # Concatenate the version numbers since ldconfig expects exactly
++      # one component indicating the ABI version, while LLVM uses
++      # major+minor for that.
++      SOVERSION ${LLVM_VERSION_MAJOR}.${LLVM_VERSION_MINOR}
++      VERSION ${LLVM_VERSION_MAJOR}.${LLVM_VERSION_MINOR}.${LLVM_VERSION_PATCH}${LLVM_VERSION_SUFFIX})
++  endif()
+ endif()
+
+ if(ARG_MODULE OR ARG_SHARED)
diff --git a/llvm3.9.spec b/llvm3.9.spec
new file mode 100644
index 0000000..afdb820
--- /dev/null
+++ b/llvm3.9.spec
@@ -0,0 +1,385 @@
+# Components enabled if supported by target architecture:
+%ifarch %ix86 x86_64
+  %bcond_without gold
+%else
+  %bcond_with gold
+%endif
+
+%global major_version 3.9
+
+Name:		llvm%{major_version}
+Version:	%{major_version}.1
+Release:	4%{?dist}
+Summary:	The Low Level Virtual Machine Version %{major_version}
+
+License:	NCSA
+URL:		http://llvm.org
+Source0:	http://llvm.org/releases/%{version}/llvm-%{version}.src.tar.xz
+
+Source100:	llvm-config.h
+
+# recognize s390 as SystemZ when configuring build
+Patch0:		llvm-3.7.1-cmake-s390.patch
+
+Patch1:		0001-This-code-block-breaks-the-docs-build-http-lab.llvm..patch
+Patch2:		0001-fix-docs-2.patch
+Patch3:		0001-fix-docs-3.patch
+Patch4:		0001-docs-fix-cmake-code-block-warning.patch
+# backport from upstream to fix lldb out of tree
+Patch5:		0001-cmake-Install-CheckAtomic.cmake-needed-by-lldb.patch
+
+Patch6:		install_dirs.patch
+
+# Add SOVERSION (modified to add . between version numbers)
+# https://reviews.llvm.org/rL283189
+Patch7:		llvm-soversion.patch
+
+Patch71:	llvm-arm-fix-prel31.patch
+Patch72:	llvm-D23597_sdag_names.patch
+Patch73:	llvm-D24300_ptx_intrinsics.patch
+Patch74:	llvm-D25865-cmakeshlib.patch
+Patch75:	llvm-D27609-AArch64-UABS_G3.patch
+Patch76:	llvm-D27629-AArch64-large_model.patch
+Patch77:	llvm-D9168_argument_alignment.patch
+Patch78:	llvm-PR22923.patch
+Patch80:	llvm-D27389.patch
+Patch81:	llvm-D27397.patch
+Patch82:	llvm-D28009.patch
+
+BuildRequires:	cmake3
+BuildRequires:	zlib-devel
+BuildRequires:	libedit-devel
+BuildRequires:	libffi-devel
+BuildRequires:	ncurses-devel
+BuildRequires:	python-sphinx
+BuildRequires:	valgrind-devel
+%if %{with gold}
+BuildRequires:	binutils-devel
+%endif
+BuildRequires:	libstdc++-static
+
+Requires:	%{name}-libs%{?_isa} = %{version}-%{release}
+
+%description
+LLVM is a compiler infrastructure designed for compile-time, link-time,
+runtime, and idle-time optimization of programs from arbitrary programming
+languages. The compiler infrastructure includes mirror sets of programming
+tools as well as libraries with equivalent functionality.
+
+This package contains LLVM %{major_version} and can be installed
+in parallel with other LLVM versions.
+
+
+%package devel
+Summary:	Libraries and header files for LLVM %{major_version}
+Requires:	%{name}%{?_isa} = %{version}-%{release}
+
+%description devel
+This package contains library and header files needed to develop new native
+programs that use the LLVM infrastructure.
+
+This package contains LLVM %{major_version} and can be installed
+in parallel with other LLVM versions.
+
+%package doc
+Summary:	Documentation for LLVM %{major_version}
+BuildArch:	noarch
+Requires:	%{name} = %{version}-%{release}
+
+%description doc
+Documentation for the LLVM compiler infrastructure.
+
+This package contains LLVM %{major_version} and can be installed
+in parallel with other LLVM versions.
+
+%package libs
+Summary:	LLVM %{major_version} shared libraries
+
+%description libs
+Shared libraries for the LLVM compiler infrastructure.
+
+This package contains LLVM %{major_version} and can be installed
+in parallel with other LLVM versions.
+
+%package static
+Summary:	LLVM %{major_version} static libraries
+Requires:	%{name}-devel%{?_isa} = %{version}-%{release}
+Requires:	ncurses-devel%{?_isa}
+
+%description static
+Static libraries for the LLVM compiler infrastructure.
+
+This package contains LLVM %{major_version} and can be installed
+in parallel with other LLVM versions.
+
+
+%prep
+%setup -q -n llvm-%{version}.src
+%patch0 -p1 -b .s390
+%patch1 -p1 -b .sphinx
+%patch2 -p1 -b .docs2
+%patch3 -p1 -b .docs3
+%patch4 -p1 -b .docs4
+%patch5 -p1 -b .lldbfix
+%patch6 -p1 -b .instdirs
+%patch7 -p0 -b .soversion
+%patch71 -p1 -b .julia1
+%patch72 -p1 -b .julia2
+%patch73 -p1 -b .julia3
+%patch74 -p1 -b .julia4
+%patch75 -p1 -b .julia5
+%patch76 -p1 -b .julia6
+%patch77 -p1 -b .julia7
+%patch78 -p1 -b .julia8
+%patch80 -p1 -b .julia9
+%patch81 -p1 -b .julia10
+%patch82 -p1 -b .julia11
+
+%build
+mkdir -p _build
+cd _build
+
+%ifarch s390
+# Decrease debuginfo verbosity to reduce memory consumption during final library linking
+%global optflags %(echo %{optflags} | sed 's/-g /-g1 /')
+%endif
+
+# force off shared libs as cmake macros turns it on.
+%cmake3 .. \
+	-DBUILD_SHARED_LIBS:BOOL=OFF \
+	-DCMAKE_BUILD_TYPE=RelWithDebInfo \
+	-DCMAKE_SHARED_LINKER_FLAGS="-Wl,-Bsymbolic -static-libstdc++" \
+%ifarch s390
+	-DCMAKE_C_FLAGS_RELWITHDEBINFO="%{optflags} -DNDEBUG" \
+	-DCMAKE_CXX_FLAGS_RELWITHDEBINFO="%{optflags} -DNDEBUG" \
+%endif
+	-DCMAKE_INSTALL_PREFIX=%{_libdir}/%{name} \
+	-DLLVM_TARGETS_TO_BUILD="X86;AMDGPU;PowerPC;NVPTX;SystemZ;AArch64;ARM;Mips;BPF" \
+	-DLLVM_ENABLE_LIBCXX:BOOL=OFF \
+	-DLLVM_ENABLE_ZLIB:BOOL=ON \
+	-DLLVM_ENABLE_FFI:BOOL=ON \
+	-DLLVM_ENABLE_RTTI:BOOL=ON \
+%if %{with gold}
+	-DLLVM_BINUTILS_INCDIR=%{_includedir} \
+%endif
+	\
+	-DLLVM_BUILD_RUNTIME:BOOL=ON \
+	\
+	-DLLVM_INCLUDE_TOOLS:BOOL=ON \
+	-DLLVM_BUILD_TOOLS:BOOL=ON \
+	\
+	-DLLVM_INCLUDE_TESTS:BOOL=ON \
+	-DLLVM_BUILD_TESTS:BOOL=ON \
+	\
+	-DLLVM_INCLUDE_EXAMPLES:BOOL=ON \
+	-DLLVM_BUILD_EXAMPLES:BOOL=OFF \
+	\
+	-DLLVM_INCLUDE_UTILS:BOOL=ON \
+	-DLLVM_INSTALL_UTILS:BOOL=OFF \
+	\
+	-DLLVM_INCLUDE_DOCS:BOOL=ON \
+	-DLLVM_BUILD_DOCS:BOOL=ON \
+	-DLLVM_ENABLE_SPHINX:BOOL=ON \
+	-DLLVM_ENABLE_DOXYGEN:BOOL=OFF \
+	-DSPHINX_OUTPUT_HTML:BOOL=OFF \
+	-DSPHINX_WARNINGS_AS_ERRORS:BOOL=OFF \
+	\
+	-DLLVM_BUILD_LLVM_DYLIB:BOOL=ON \
+	-DLLVM_DYLIB_EXPORT_ALL:BOOL=ON \
+	-DLLVM_LINK_LLVM_DYLIB:BOOL=ON \
+	-DLLVM_BUILD_EXTERNAL_COMPILER_RT:BOOL=ON \
+	-DLLVM_INSTALL_TOOLCHAIN_ONLY:BOOL=OFF \
+	\
+	-DSPHINX_EXECUTABLE=%{_bindir}/sphinx-build
+
+make %{?_smp_mflags}
+
+%install
+cd _build
+make install DESTDIR=%{buildroot}
+cd -
+
+# Move and symlink into FHS dirs
+mkdir -p %{buildroot}%{_bindir}
+for bin in %{buildroot}%{_libdir}/%{name}/bin/*
+do
+  # Cannot move llvm-config due to runtime prefix determination
+  [ ${bin##*/} = llvm-config ] && continue
+  # Already versioned binaries
+  if [ ${bin%%%{major_version}} != $bin ]
+  then
+    mv $bin %{buildroot}%{_bindir}/${bin##*/}
+    ln -s ../../../bin/${bin##*/} %{buildroot}%{_libdir}/%{name}/bin/${bin##*/}
+  else
+    # Unversioned binaries
+    if [ -L $bin ]
+    then
+      target=$(readlink $bin)
+      # Make the link point to the versioned binary if needed
+      [ ${target%%%{major_version}} == $target ] && ln -sf ${target}-%{major_version} $bin
+      [ $target == clang-%{major_version} ] && continue
+    fi
+    mv $bin %{buildroot}%{_bindir}/${bin##*/}-%{major_version}
+    ln -s ../../../bin/${bin##*/}-%{major_version} %{buildroot}%{_libdir}/%{name}/bin/${bin##*/}
+  fi
+done
+for dir in include
+do
+  mkdir -p %{buildroot}%{_prefix}/$dir/%{name}
+  mv %{buildroot}%{_libdir}/%{name}/$dir/* %{buildroot}%{_prefix}/$dir/%{name}/
+  rmdir %{buildroot}%{_libdir}/%{name}/$dir
+  ln -s ../../$dir/%{name} %{buildroot}%{_libdir}/%{name}/$dir
+done
+mkdir -p %{buildroot}%{_libdir}/cmake
+mv %{buildroot}%{_libdir}/%{name}/lib/cmake/llvm %{buildroot}%{_libdir}/cmake/%{name}
+ln -s ../../../cmake/%{name} %{buildroot}%{_libdir}/%{name}/lib/cmake/llvm
+mkdir -p %{buildroot}%{_datadir}
+mv %{buildroot}%{_libdir}/%{name}/share/man %{buildroot}%{_datadir}/
+
+# fix multi-lib
+mv -v %{buildroot}%{_includedir}/%{name}/llvm/Config/llvm-config{,-%{__isa_bits}}.h
+install -m 0644 %{SOURCE100} %{buildroot}%{_includedir}/%{name}/llvm/Config/llvm-config.h
+
+# Create ld.so.conf.d entry
+mkdir -p %{buildroot}%{_sysconfdir}/ld.so.conf.d
+cat >> %{buildroot}%{_sysconfdir}/ld.so.conf.d/%{name}-%{_arch}.conf << EOF
+%{_libdir}/%{name}/lib
+EOF
+
+# suffix mandir files with major version to avoid conflict with llvm
+for i in %{buildroot}%{_mandir}/man1/*; do
+  mv $i "${i%.*}-%{major_version}.1"
+done
+
+%check
+cd _build
+make V=1 check-all
+
+%post libs -p /sbin/ldconfig
+%postun libs -p /sbin/ldconfig
+
+%files
+%{_bindir}/*
+%dir %{_libdir}/%{name}/bin
+%{_libdir}/%{name}/bin/*
+%exclude %{_libdir}/%{name}/bin/llvm-config
+%{_mandir}/man1/*.1.*
+%exclude %{_mandir}/man1/llvm-config-%{major_version}.1.*
+%license LICENSE.TXT
+
+%files libs
+%config(noreplace) %{_sysconfdir}/ld.so.conf.d/%{name}-%{_arch}.conf
+%dir %{_libdir}/%{name}
+%dir %{_libdir}/%{name}/lib
+%{_libdir}/%{name}/lib/BugpointPasses.so
+%{_libdir}/%{name}/lib/LLVMHello.so
+%if %{with gold}
+%{_libdir}/%{name}/lib/LLVMgold.so
+%endif
+%{_libdir}/%{name}/lib/libLLVM-%{major_version}*.so
+%{_libdir}/%{name}/lib/libLTO.so.%{major_version}*
+%license LICENSE.TXT
+
+%files devel
+%{_libdir}/%{name}/bin/llvm-config
+%{_mandir}/man1/llvm-config-%{major_version}.1.*
+%{_includedir}/%{name}/llvm
+%{_includedir}/%{name}/llvm-c
+%{_libdir}/%{name}/include
+%{_libdir}/%{name}/lib/libLLVM.so
+%{_libdir}/%{name}/lib/libLTO.so
+%{_libdir}/cmake/%{name}
+%{_libdir}/%{name}/lib/cmake
+
+%files static
+%{_libdir}/%{name}/lib/*.a
+
+%changelog
+* Fri Feb 10 2017 Orion Poplawski - 3.9.1-4
+- Add patch to add sonames to libraries
+- Make -static require ncurses-devel
+
+* Thu Feb 9 2017 Orion Poplawski - 3.9.1-3
+- Install into libdir prefix
+
+* Mon Jan 2 2017 Milan Bouchet-Valat - 3.9.1-2
+- Add patches needed by Julia.
+- Disable Sphinx docs (which currently cause the build to fail on Rawhide).
+- Replace remaining spaces with tabs for consistency.
+- Add dependency on -devel for -static package.
+- Fix missing Requires(postun).
+- Fix links to unversioned llvm-ar.
+
+* Thu Dec 29 2016 Milan Bouchet-Valat - 3.9.1-1
+- Create versioned llvm3.9 package.
+
+* Tue Nov 29 2016 Josh Stone - 3.9.0-7
+- Apply backports from rust-lang/llvm#55, #57
+
+* Tue Nov 01 2016 Dave Airlie - 3.9.0-5
+- apply the patch from -4
+
+* Wed Oct 26 2016 Dave Airlie - 3.9.0-4
+- add fix for lldb out-of-tree build
+
+* Mon Oct 17 2016 Josh Stone - 3.9.0-3
+- Apply backports from rust-lang/llvm#47, #48, #53, #54
+
+* Sat Oct 15 2016 Josh Stone - 3.9.0-2
+- Apply an InstCombine backport via rust-lang/llvm#51
+
+* Wed Sep 07 2016 Dave Airlie - 3.9.0-1
+- llvm 3.9.0
+- upstream moved where cmake files are packaged.
+- upstream dropped CppBackend
+
+* Wed Jul 13 2016 Adam Jackson - 3.8.1-1
+- llvm 3.8.1
+- Add mips target
+- Fix some shared library mispackaging
+
+* Tue Jun 07 2016 Jan Vcelak - 3.8.0-2
+- fix color support detection on terminal
+
+* Thu Mar 10 2016 Dave Airlie 3.8.0-1
+- llvm 3.8.0 release
+
+* Wed Mar 09 2016 Dan Horák 3.8.0-0.3
+- install back memory consumption workaround for s390
+
+* Thu Mar 03 2016 Dave Airlie 3.8.0-0.2
+- llvm 3.8.0 rc3 release
+
+* Fri Feb 19 2016 Dave Airlie 3.8.0-0.1
+- llvm 3.8.0 rc2 release
+
+* Tue Feb 16 2016 Dan Horák 3.7.1-7
+- recognize s390 as SystemZ when configuring build
+
+* Sat Feb 13 2016 Dave Airlie 3.7.1-6
+- export C++ API for mesa.
+
+* Sat Feb 13 2016 Dave Airlie 3.7.1-5
+- reintroduce llvm-static, clang needs it currently.
+
+* Fri Feb 12 2016 Dave Airlie 3.7.1-4
+- jump back to single llvm library, the split libs aren't working very well.
+
+* Fri Feb 05 2016 Dave Airlie 3.7.1-3
+- add missing obsoletes (#1303497)
+
+* Thu Feb 04 2016 Fedora Release Engineering - 3.7.1-2
+- Rebuilt for https://fedoraproject.org/wiki/Fedora_24_Mass_Rebuild
+
+* Thu Jan 07 2016 Jan Vcelak 3.7.1-1
+- new upstream release
+- enable gold linker
+
+* Wed Nov 04 2015 Jan Vcelak 3.7.0-100
+- fix Requires for subpackages on the main package
+
+* Tue Oct 06 2015 Jan Vcelak 3.7.0-100
+- initial version using cmake build system
diff --git a/sources b/sources
index e69de29..b107154 100644
--- a/sources
+++ b/sources
@@ -0,0 +1 @@
+SHA512 (llvm-3.9.1.src.tar.xz) = 50cbe8ee911080f586e77861c442348701bd02e2de0c090c54c34f82ac275ecfcd712af0f41e387c33b4a6057778a4258a27554292fe68ab4af3fd9dd6d90683