面试题, floor(sqrt(x))的几种方法
如题,x是32位unsigned integer.
有Quake公式,(n+1)**2 - n**2 = 2n + 1, 即1+3=4, 4+5=9, 9+7=16 ...
Here are the numbers from original Pentium, but these change dramatically in the later chips. mul 10 fmul 1 div 41 fdiv 39 Yes, floating multiply is much faster than an integer multiply. [link]
查表法: 65536的平方就溢出了。把0~65535这6万多个数的平方放在数组里。数组是有序的,可二分查找——找不到时low和high是啥情况呢?判断某个x是不是平方数可用STL的set(放水版)。
今天看了下Icarus Verilog的sample里的sqrt.vl, 不太懂,改了个python版,特意用了**和*而不是<<, +而不是|
x = 101 acc = 0; acc2 = 0 for bitl in range(15, -1, -1): bit = 2 ** bitl; bit2 = 2 ** (bitl * 2) guess = acc + bit guess2 = acc2 + bit2 + acc * (2 ** bitl) * 2 if guess2 <= x: acc = guess; acc2 = guess2 print(acc)
又想了想好像彻底懂了。它里面的提到的binary search是二进制查找,不是二分查找。用十进制也可以: 6789的平方根是90吗?90*90>6789, 所以十位不可能是9,因为91, 92...更大。十位是8吗?肯定是, 因为80*80都比6789小了,79, 68, 57之流更没戏。倒过来从1到9,10*10<6789不说明十位一定是1,因为19*19使出洪荒之力也不行。二进制每位除了1就是0,把一个5x10的循环变成16x1的循环。acc2 + bit2... 写成x**2 + 2xy + y**2更好懂。注释里本来写得就是acc2 + 2 * (acc< Verilog代码如下: 我把"printf"改了下,#100 value...和#5 clk = ~clk都改成了#1,好像没改坏。原始版上电后过100个timescale再说,每5个timescale 1个clock。想起来以前进机房要换拖鞋,“最烦你们这帮调中断的,pia-pia地把电脑都搞坏了!” 很遗憾上面那个变不成电路,合成不了,Logic synthesis失败?下面这个可以但我看不懂: 轻飘飘一句floor(sqrt()),有没有想过背后的tears与欢笑?:-) 这还没到浮点呢。/*
* Copyright (c) 1999 Stephen Williams (steve@icarus.com)
*
* This source code is free software; you can redistribute it
* and/or modify it in source code form under the terms of the GNU
* General Public License as published by the Free Software
* Foundation; either version 2 of the License, or (at your option)
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA
*
* $Id: sqrt.vl,v 1.4 2004/10/04 01:10:56 steve Exp $"
*/
/*
* This example shows that Icarus Verilog can run non-trivial
* programs, too. This uses a variety of Verilog language features
* to implement the module of a square-root device. The program
* uses IEEE1364-1995 language features and should work correctly
* on any Verilog compiler.
*
* Run the file with Icarus Verilog under UNIX using the command:
*
* % iverilog -osqrt sqrt.v
* % ./sqrt
*/
/*
* This module approximates the square root of an unsigned 32bit
* number. The algorithm works by doing a bit-wise binary search.
* Starting from the most significant bit, the accumulated value
* tries to put a 1 in the bit position. If that makes the square
* to big for the input, the bit is left zero, otherwise it is set
* in the result. This continues for each bit, decreasing in
* significance, until all the bits are calculated or all the
* remaining bits are zero.
*
* Since the result is an integer, this function really calculates
* value of the expression:
*
* x = floor(sqrt(y))
*
* where sqrt(y) is the exact square root of y and floor(N) is the
* largest integer <= N.
*
* For 32bit numbers, this will never run more then 16 iterations,
* which amounts to 16 clocks.
*/
module sqrt32(clk, rdy, reset, x, .y(acc));
input clk;
output rdy;
input reset;
input [31:0] x;
output [15:0] acc;
// acc holds the accumulated result, and acc2 is the accumulated
// square of the accumulated result.
reg [15:0] acc;
reg [31:0] acc2;
// Keep track of which bit I'm working on.
reg [4:0] bitl;
wire [15:0] bit = 1 << bitl;
wire [31:0] bit2 = 1 << (bitl << 1);
// The output is ready when the bitl counter underflows.
wire rdy = bitl[4];
// guess holds the potential next values for acc, and guess2 holds
// the square of that guess. The guess2 calculation is a little bit
// subtle. The idea is that:
//
// guess2 = (acc + bit) * (acc + bit)
// = (acc * acc) + 2*acc*bit + bit*bit
// = acc2 + 2*acc*bit + bit2
// = acc2 + 2 * (acc<
/*
* Copyright (c) 2002 Stephen Williams (steve@icarus.com)
*
* This source code is free software; you can redistribute it
* and/or modify it in source code form under the terms of the GNU
* General Public License as published by the Free Software
* Foundation; either version 2 of the License, or (at your option)
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA
*
* $Id: sqrt-virtex.v,v 1.4.2.2 2005/02/23 18:37:52 steve Exp $"
*/
/*
* This module is a synthesizeable square-root function. It is also a
* detailed example of how to target Xilinx Virtex parts using
* Icarus Verilog. In fact, for no particular reason other than to
* be excessively specific, I will step through the process of
* generating a design for a Spartan-II XC2S15-VQ100, and also how to
* generate a generic library part for larger Virtex designs.
*
* In addition to Icarus Verilog, you will need implementation
* software from Xilinx. As of this writing, this example was tested
* with Foundation 4.2i, but it should work the same with ISE and
* Webpack software.
*
* This example source contains all the Verilog needed to do
* everything described below. We use conditional compilation to
* select the bits of Verilog that are needed to perform each specific
* task.
*
* SIMULATE THE DESIGN
*
* This source file includes a simulation test bench. To compile the
* program to include this test bench, use the command line:
*
* iverilog -DSIMULATE=1 -oa.out sqrt-virtex.v
*
* This generates the file "a.out" that can then be executed with the
* command:
*
* vvp a.out
*
* This causes the simulation to run a long set of example sqrt
* calculations. Each result is checked by the test bench to assure
* that the result is valid. When it is done, the program prints
* "PASSED" and finishes the simulation.
*
* When you take a close look at the "main" module below, you will see
* that it uses Verilog constructs that are not synthesizeable. This
* is fine, as we will never try to synthesize it.
*
* LIBRARY PARTS
*
* One can use the sqrt32 module to generate an EDIF file suitable for
* use as a library part. This part can be imported to the Xilinx
* schematic editor, then placed like any other pre-existing
* macro. One can also pass the generated EDIF as a precompiled macro
* that other designers may use as they see fit.
*
* To make an EDIF file from the sqrt32 module, execute the command:
*
* iverilog -osqrt32.edf -tfpga -parch=virtex sqrt-virtex.v
*
* The -parch=virtex tells the code generator to generate code for the
* virtex architecture family (we don't yet care what specific part)
* and the -osqrt32.edf places the output into the file
* sqrt32.edf.
*
* Without any preprocessor directives, the only module is the sqrt32
* module, so sqrt32 is compiled as the root. The ports of the module
* are automatically made into ports of the sqrt32.edf netlist, and
* the contents of the sqrt32 module are connected approprately.
*
* COMPLETE CHIP DESIGNS
*
* To make a complete chip design, there are other bits that need to
* be accounted for. Signals must be assigned to pins, and some
* special devices may need to be created. We also want to write into
* the EDIF file complete part information so that the implementation
* tools know how to route the complete design. The command to compile
* for our target part is:
*
* iverilog -ochip.edf -tfpga \
* -parch=virtex -ppart=XC2S15-VQ100 \
* -DMAKE_CHIP=1 sqrt-virtex.v
*
* This command uses the "chip" module as the root. This module in
* turn has ports that are destined to be the pins of the completed
* part. The -ppart= option gives complete part information, that is
* in turn written into the EDIF file. This saves us the drudgery of
* repeating that part number for later commands.
*
* The next steps involve Xilinx software, and to talk to Xilinx
* software, the netlist must be in the form of an "ngd" file, a
* binary netlist format. The command:
*
* ngdbuild chip.edf chip.ngd
*
* does the trick. The input to ngdbuild is the chip.edf file created
* by Icarus Verilog, and the output is the chip.ngd file that the
* implementation tools may read. From this point, it is best to refer
* to Xilinx documentation for the software you are using, but the
* quick summary is:
*
* map -o map.ncd chip.ngd
* par -w map.ncd chip.ncd
*
* The result of this sequence of commands is the chip.ncd file that
* is ready to be viewed by FPGA Edit, or converted to a bit stream,
* or whatever.
*
* POST MAP SIMULATION
*
* Warm fuzzies are good, and retesting your design after the part
* is mapped by the Xilinx backend tools is a cheap source of fuzzies.
* The command to make a Verilog file out of the mapped design is:
*
* ngd2ver chip.ngd chip_root.v
*
* This command creates from the chip.ngd the file "chip_root.v" that
* contains Verilog code that simulates the mapped design. This output
* Verilog has the single root module "chip_root", which came from the
* name of the root module when we were making the EDIF file in the
* first place. The module has ports named just like the ports of the
* chip_root module below.
*
* The generated Verilog uses the library in the directory
* $(XILINX)/verilog/src/simprims. This directory comes with the ISE
* WebPACK installation that you are using. Icarus Verilog is able to
* simulate using that library.
*
* To compile a post-map simulation of the chip_root.v, use the
* command:
*
* iverilog -DSIMULATE -DPOST_MAP -ob.out \
* -y $(XILINX)/verilog/src/simprims \
* sqrt-virtex.v chip_root.v \
* $(XILINX)/verilog/src/glbl.v
*
* This command line generates b.out from the source files
* sqrt-virtex.v and chip_root.v (the latter from ngd2ver)
* and the "-y